Importance of Relationship Between
Various Body Measurements in
Performance of the Toe-Touch Test


MARION R. BROER 
University of Washington 
Seattle, Washington 
NAOMI R. G. GALLES 
Moscow, Idaho 
Abstract 


The primary purpose of this study was to investigate the importance of the relationship
of trunk-plus-arm length (reach) to leg length in the ability to perform the toe-touch
test. Data were collected on 100 college women. Various anthropometric measurements,
flexibility scores (Leighton flexometer), and toe-touch scores were obtained. Results
indicate that the relationship of reach length to leg length is not an important factor
in the performance of the toe-touch test for persons with average body builds, but that,
for those with extreme body builds, a longer trunk-plus-arm (reach) measurement in
relation to shorter legs gives an advantage in the performance of this test.


THE ABILITY to touch the toes has been quite generally accepted as a
“normal” accomplishment in physical examinations and in therapeutic exer- 
cise programs. The question as to whether this ability is “normal” for all 
ages and for all body builds has frequently arisen. The Kraus-Weber test 
(1) has evoked considerable discussion. The sixth item of this test, the flexi- 
bility item requiring the fingertips to be held touching the floor for three 
seconds, has caused considerable controversy. 

Various comments have indicated that many individuals question the 
validity of a single standard on such a test for all body builds. Mathews, 
Shaw, and Bohnen (4) recently reported a study of the relationship of 
reaching height, standing height, and leg length to hip flexibility of college 
women. The Kraus-Weber and the Wells Sit and Reach tests, as well as 
flexibility measurement taken with the Leighton flexometer, were used to 
measure hip flexibility. Their results indicated that there was no significant 
relationship between the three tests of flexibility and the three anthropometric 
measures. They did not, however, study the relationship between reaching
length and leg length nor between weight and height. 


Purpose of the Study 

It was the purpose of this study to determine the importance of the relation- 
ship of trunk-plus-arm length (reach) to leg length and of weight to height 
in the ability to perform the toe-touch test. 


Procedure 
The data were obtained from 100 University of Washington women students 
enrolled in regular physical education activity classes. These women ranged 
from 18 to 31 years of age, though 91 per cent were between 18 and 22.
Only three were over 24. 
Several anthropometric and two flexibility measurements were taken. 


ANTHROPOMETRIC MEASUREMENTS 


The examiner constructed two standard lineal scales on large cardboard 
poster paper, 28 by 22 inches, and taped them securely to the walls in one 
corner of the gymnasium so that the two scales were at right angles to each 
other. The scales were 102 inches high, with every half-inch drawn com- 
pletely across the paper in pencil, every inch drawn in red pencil, and every 
12 inches, or each foot, designated by a blue line. After being attached to 
the wall, these scales were very carefully checked with a standard scale to 
insure accuracy. All measurements were taken in bare or stocking feet. 


Standing height. With the student’s feet flat on the floor and her heels, hips, 
and head back against the wall, standing erect, chin level, her standing height 
was measured to the nearest one-fourth inch. The height measurements were 
taken on the right side of the corner wall, with a wooden clipboard placed 
on the student’s head. The clipboard was touching both walls parallel to the 
lines of the scales to insure correct measurement. 


Reaching height. The student stood facing the scale, feet flat on the floor 
and toes, chest, and forehead touching the wall. She then reached with both 
hands over her head as high as possible against the scale. The height of her 
fingertips was measured to the nearest one-fourth inch. 

Leg length. To determine the leg length, the examiner faced the student, 
placed her hands approximately four to six inches below the student’s waist 
on each hip and asked the student to swing her right leg back and forth 
slowly, and then to lift it to the outside. By manipulation, the examiner was 
able to locate the spot where the greater trochanter entered the pelvic girdle. 
The height of the greater trochanter from the floor was measured. This pro- 
cedure was followed twice with every student. 


Weight. All students were weighed without their shoes and dressed in their 
activity clothes (shirts and shorts). 


FLEXIBILITY 


The flexibility of the lower back and hip was measured by the toe-touch 
test and the Leighton Flexometer (2). 


Toe-touch test. The toe-touch test was performed on a gymnasium bench 14
inches high. Two wooden yardsticks were securely fastened to either side of 
the front of the bench 16 inches apart. These yardsticks had been covered 
with paper on which the inch and half-inch markings were traced. The 
number 0 was placed level with the top of the bench and each inch above 
was numbered and marked minus while those below were marked plus. The 
total 36 inches of the yardsticks were used, from 0 to +14 inches below the 
bench, and 0 to —22 inches above the bench. 
To measure the student’s downward reach, another wooden yardstick cross-
ing the two scales was held at the student’s fingertip level. The downward
distance the student could attain when performing the toe-touch test could 
be read on either scale. 

As a safety precaution, the two yardsticks on either side of the bench 
were used rather than one in the middle of the bench. 

The position of the performer and the command given were similar to 
the ones used in the Kraus-Weber test (1). However, as the test was per-
formed on a bench, the student was told to reach down as far as she could
with her fingertips. The examiner then demonstrated how the test should be 
performed. 

All students were given a pre-trial of the toe-touch test from the bench 
to determine whether they had any fear of falling. The examiner “spotted” 
the student during this preliminary trial and then asked if the student had 
any fear of falling that kept her from making a maximal effort. Ninety-nine 
students replied that they had no fear whatsoever, and only one answered 
that although she did not have any fear, she might have a subconscious 
feeling of restraint. 

Leighton flexometer measurement. This measurement of hip and lower back 
flexibility was taken at the same time that the student performed the toe- 
touch test. The flexometer was strapped to the student’s chest under the 
right armpit while the student rested her hands on her head. When the 
flexometer was snugly fastened, the student lifted her arms directly above her 
head. With both the dial and the pointer of the flexometer pointing to zero, 
the dial was locked and the student was given the command noted previously. 
While the maximum downward position was held for the count of three, 
the pointer of the flexometer was locked in place. The examiner then checked 
the measurement of the toe-touch test to make sure the yardstick was being 
held level at the performer’s fingertips. The student returned to a standing 
position, the flexometer was read, and the scores were recorded. The first 50 
students repeated the performance and both measurements were taken again. 


RELIABILITY 

Various authors have reported reliability coefficients which indicate that 
these measurements can be highly reliable. To check the accuracy of the 
examiner, all measurements were repeated on the first 50 subjects. Because of 
the difficulty of locating the center of the greater trochanter, the leg length 
measurement was repeated on all 100 subjects. Reliability coefficients were 
calculated, using the Pearson Product-Moment method with the scattergram 
technique.


Analysis of Data

The intercorrelation of variables and the differences in the ability to perform
the toe-touch test between “extreme” body types were the primary methods used
in analyzing the data.

The reach-to-leg ratio was determined by dividing the student’s reach 
(trunk-plus-arm length) by the student’s leg length. Trunk-plus-arm length
was determined by subtracting the student’s leg length from her reaching 
height. This measurement was called the “reach length” of each individual 
as she performed the toe-touch test. This relationship of reach length to leg 
length was the main factor investigated in this study. 

The weight-height ratio was determined by dividing the weight of the 
student by her standing height. 
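
The two ratios follow directly from the measurements described above. The following is a minimal Python sketch of that arithmetic. The function names and the sample figures are hypothetical, and the units (inches for lengths, pounds for weight) are an assumption, since the text does not state the unit of weight.

```python
def reach_length(reaching_height, leg_length):
    """Trunk-plus-arm length: reaching height minus leg length."""
    return reaching_height - leg_length


def reach_to_leg_ratio(reaching_height, leg_length):
    """Reach length divided by leg length."""
    return reach_length(reaching_height, leg_length) / leg_length


def weight_height_ratio(weight, standing_height):
    """Weight divided by standing height."""
    return weight / standing_height


# Hypothetical student; these figures are not taken from the study's data.
ratio = reach_to_leg_ratio(reaching_height=80.0, leg_length=35.0)
index = weight_height_ratio(weight=126.0, standing_height=64.0)
```

A larger ratio describes a student with a longer reach relative to her legs; a larger index describes a student who is heavier for her height.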


The toe-touch test had been scored as the number of inches by which the 
student failed to touch her toes or by which she passed the level of her toes. 
This necessitated use of negative and positive scores. To facilitate tabulation 
and eliminate the minus signs, a scale of positive numbers was used. Since 
no one failed by more than ten inches to reach the level of her toes, the —10 
score was given the rating of zero. The score of —9.5 was rated one, —9 
was rated two, etc. The original zero score at the level of the toes received 
the rating of 20. Each half-inch lower than the toes was rated an additional 
point. Any number under 20 indicated that the student could not reach her 
toes, and any number over 20 indicated that the student could reach below 
her toes. 
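
The conversion described above amounts to counting half-inch steps upward from a score of -10 inches. A short Python sketch of the conversion and its inverse is given here; the function names are hypothetical, and scores are assumed to be recorded to the nearest half-inch.

```python
def toe_touch_rating(inches):
    """Convert a signed toe-touch score (inches; negative = above the toes,
    positive = below the toes) to the positive rating scale."""
    if inches < -10:
        raise ValueError("the scale starts at -10 inches")
    # Each half-inch above -10 adds one point; toe level (0) becomes 20.
    return round(2 * (inches + 10))


def rating_to_inches(rating):
    """Convert a rating (or a mean rating) back to signed inches."""
    return rating / 2 - 10


ratings = [toe_touch_rating(x) for x in (-10, -9.5, -9, 0, 3.5, 7)]
```

The inverse is useful when reading group means: a mean rating above 20 lies below toe level, and a mean rating below 20 lies above it.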


The range of scores, mean, and standard deviation were found for each 
of the following measurements: standing height, leg length, reach-to-leg ratio, 
flexibility of the hip and back, the toe-touch test, weight, and the weight-height 
ratio. 

Using the Pearson Product-Moment method of correlation with the scattergram
technique, all of the above variables were intercorrelated. These coefficients
were compared with those indicated by Lindquist (3) as significant at the
5 per cent and 1 per cent levels for a group of this size (100).
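
The scattergram technique was a hand-tabulation method. As an illustration only, the same product-moment coefficient can be computed directly from raw pairs of scores, as in the Python sketch below. The function name and the four sample pairs are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical reach-to-leg ratios and toe-touch ratings for four students.
r_example = pearson_r([1.21, 1.25, 1.28, 1.32], [17, 20, 19, 24])
```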

Six partial correlations were calculated. These included: 


1. Toe-touch test with flexibility, reach-to-leg ratio held constant.
2. Toe-touch test with reach-to-leg ratio, flexibility held constant.
3. Toe-touch test with flexibility, leg length held constant.
4. Toe-touch test with leg length, flexibility held constant.
5. Toe-touch test with flexibility, weight-height ratio held constant.
6. Toe-touch test with weight-height ratio, flexibility held constant.
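
The text does not state how the partial correlations were computed. The sketch below assumes the standard first-order partial correlation formula and applies it to the first two cases in the list, using the two-decimal coefficients reported in Table 2. Because the inputs are rounded, the results may differ slightly in the second decimal from the values the authors obtained. The function and variable names are hypothetical.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero-order coefficients as reported in Table 2 (rounded to two decimals).
r_touch_flex = 0.81    # toe-touch test with flexibility
r_touch_ratio = 0.22   # toe-touch test with reach-to-leg ratio
r_flex_ratio = 0.05    # flexibility with reach-to-leg ratio

# Toe-touch with flexibility, reach-to-leg ratio held constant.
touch_flex_given_ratio = partial_r(r_touch_flex, r_touch_ratio, r_flex_ratio)

# Toe-touch with reach-to-leg ratio, flexibility held constant.
touch_ratio_given_flex = partial_r(r_touch_ratio, r_touch_flex, r_flex_ratio)
```

With these rounded inputs the first coefficient barely moves from its zero-order value, while the second rises noticeably once flexibility is held constant.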


To determine whether “extremes” in body build might influence ability in
the toe-touch test, the mean toe-touch test score of those students who had
the longest reach in relation to leg length was compared with that of those
who had the shortest reach in relation to leg length.


The distribution of the reach-to-leg ratio scores was examined. The 30 
cases falling at the top of the distribution were put into the high group and 
the 30 at the bottom into the low group. The low group was, therefore, made 
up of students who had the longer legs and shorter reach, while the high 
group included those who had the shorter legs in relation to the length 
of their reach (trunk-plus-arm length).

The mean, standard deviation, and sigma of the mean for the performance
of the toe-touch test were found for both the high and low groups. The sigma 
of the difference and the t were calculated to determine whether the difference 
in ability of these two body types to reach downward was significant. The 
Chi Square test was applied to the two toe-touch distributions to determine 
normality. 
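
The chain from standard deviation to t can be sketched in Python as follows. The sketch assumes two independent groups and takes the sigma of the mean as the standard deviation divided by the square root of N, an assumption that reproduces the 1.20 and 1.30 shown for the toe-touch test in Table 6. The means and standard deviations are those reported in Table 6; the function names are hypothetical.

```python
import math


def sigma_of_mean(sd, n):
    """Standard error of a group mean (assumed here to be SD / sqrt(N))."""
    return sd / math.sqrt(n)


def sigma_of_difference(sigma_mean_1, sigma_mean_2):
    """Standard error of the difference between two independent means."""
    return math.sqrt(sigma_mean_1 ** 2 + sigma_mean_2 ** 2)


def t_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between the means divided by the sigma of the difference."""
    sm_1 = sigma_of_mean(sd_1, n_1)
    sm_2 = sigma_of_mean(sd_2, n_2)
    return (mean_1 - mean_2) / sigma_of_difference(sm_1, sm_2)


# Toe-touch ratings for the high and low reach-to-leg groups (Table 6).
t_toe_touch = t_ratio(21.67, 6.57, 30, 17.83, 7.14, 30)
```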

To determine whether this difference might be due to flexibility rather 
than to body-build, the mean, standard deviation, sigma of the mean, sigma 
of the difference and the t were also calculated for the flexibility scores of 
the high and low groups to determine whether there was a significant differ-
ence in the flexibility of these two groups. To determine normality of these
two flexibility distributions the Chi Square test was applied to these data. 


This same analysis was carried out for the high and low groups according 
to weight-height index. The 30 students whose weights in relation to their 
heights were the greatest were placed in the high group, and those whose 
weights in relation to heights were the least were placed in the low group. The 
mean toe-touch and flexibility scores of these two groups were compared, 
and the differences were tested for significance. 


TABLE 1

Reliability Coefficients for Two Hip Flexibility Tests and
Three Anthropometric Measurements

Test

Standing height
Reaching height
Leg length
Toe-touch test
Leighton flexometer


Results 
RELIABILITY OF MEASUREMENTS 


The reliability coefficients for the two flexibility tests and the four anthro- 
pometric measurements were very high (see Table 1). 

While the reliability coefficient for leg length was the lowest, it was well 
above .9 and considerably higher than the objectivity coefficient of .84 
reported by Mathews (4, p. 354). His objectivity coefficient for the toe-touch 
test was .98, which compared favorably with the reliability of .97 obtained 
in this study. 

The reliability coefficient for the Leighton flexometer approximated closely
the .99 reported by Leighton (2, p. 212). Mathews (4, p. 354) reported an
objectivity coefficient of .88 for this measurement.

The two height measurements were extremely reliable.





258 The Research Quarterly, Vol. 29, No. 3 


RELATIONSHIP BETWEEN VARIABLES 


Intercorrelations. When the variables were intercorrelated, the coefficients 
were found to range from .00 (flexibility with standing height) to .89 (stand- 
ing height with leg length) (see Table 2). 

Using the table from Lindquist (3), it was found that, for a group of 
100, an r must be at least .197 to be significant at the 5 per cent level and 
.256 at the 1 per cent level of confidence. 
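
The two critical values can be applied mechanically to any coefficient in the study. A minimal Python sketch, with hypothetical names, is given below; the sign is ignored because only the size of the coefficient is compared with the critical value.

```python
R_CRITICAL_05 = 0.197  # 5 per cent level for N = 100, from Lindquist (3)
R_CRITICAL_01 = 0.256  # 1 per cent level for N = 100, from Lindquist (3)


def significance_level(r):
    """Classify a correlation coefficient against the two critical values."""
    magnitude = abs(r)
    if magnitude >= R_CRITICAL_01:
        return "1 per cent"
    if magnitude >= R_CRITICAL_05:
        return "5 per cent"
    return "not significant"


# Coefficients of the toe-touch test with flexibility, reach-to-leg ratio,
# leg length, and weight, as reported in this study.
levels = [significance_level(r) for r in (0.81, 0.22, -0.24, 0.12)]
```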

The relationship between the toe-touch test and flexibility was substantial 
(.81), and compared very favorably with the .80 reported by Mathews (4, 
p. 355). 


TABLE 2

Intercorrelation Coefficients of Six Variables
(N = 100)

Variable                     #1        #2        #3        #4        #5        #6

#0 Toe-touch test         .81±.02   .22±.06   .20±.06  -.14±.07  -.24±.06   .12±.07
#1 Flexibility                      .05±.07   .11±.07   .00±.07  -.07±.07   .11±.07
#2 Reach-to-leg ratio                        -.09±.07  -.22±.06  -.49±.05  -.16±.07
#3 Weight-height ratio                                  .32±.06   .32±.06   .92±.01
#4 Standing height                                                .89±.01   .65±.04
#5 Leg length                                                               .61±.04


The correlation of .22 obtained between the toe-touch test and the reach-to-leg
ratio was low, but, according to Lindquist (3), was significant at the
5 per cent level. This low positive relationship indicates that the students
with the longer trunks and arms in relation to shorter legs may have a slight
advantage in the performance of the toe-touch test. This is further
substantiated by the -.24 coefficient between leg length and the toe-touch test,
which was just below the 1 per cent level of confidence. This negative
correlation indicates that the students with the shorter legs may have some
advantage in this test.

The coefficient of correlation between the weight-height ratio and the toe- 
touch test was also significant at the 5 per cent level. This low positive 
correlation indicates that, on the whole, those with a higher index (more 
weight in relation to height) did somewhat better in the performance of the 
toe-touch test. They also tended to have slightly higher flexometer readings 
although this correlation (.11) was not significant. The correlations of weight 
and height with toe-touch test were not significant. 

The relationships between actual flexibility and reach-to-leg ratio, standing 
height, and leg length were almost zero. The only factors with which flexi- 
bility showed any correlation were those involving weight, and these were 
not significant. 

Mathews (4, p. 356) did not find any significant relationship between the
flexibility tests and the anthropometric measures. A comparison of the
coefficients of correlation of the two studies was interesting (see Table 3).


TABLE 3

A Comparison of Mathews’ Correlations with Those of the Present Study

                                     Mathews’ Results   Present Study
Variable                                 (N = 66)         (N = 100)

Toe-touch with leg length              [illegible]          -.24
Flexibility with leg length            [illegible]          -.07
Toe-touch with standing height         [illegible]          -.14
Flexibility with standing height       [illegible]           .00


TABLE 4

A Comparison of Zero Order Correlations with Partial Correlations of Certain Variables
with Other Variables Held Constant

                                       Zero
                                       Order
Test (N = 100)                           r      Variable Held Constant

Toe-touch and flexibility               .81     Reach-to-leg ratio held constant
Toe-touch and reach-to-leg ratio        .22     Flexibility held constant
Toe-touch and flexibility               .81     Leg length held constant
Toe-touch and leg length               -.24     Flexibility held constant
Toe-touch and flexibility               .81     Weight-height ratio held constant
Toe-touch and weight-height ratio       .20     Flexibility held constant


In contrast to the results of this study, Mathews found that both leg length 
and standing height correlated higher with flexibility scores than with the
toe-touch test. 

The correlation between trunk-arm length and the standing bobbing test 
(.297) reported by Scott and French (5, p. 184) closely approximates that 
found in this study between reach-to-leg ratio and the toe-touch test (.22). 

The high correlation of the weight-height ratio with weight was expected, 
as was the correlation of standing height with leg length. 

When the effect of certain variables was held constant, it was found that 
the relationships were not markedly changed (see Table 4). 

The largest differences were obtained when flexibility was held constant on
two correlations, the toe-touch test with reach-to-leg ratio, and the toe-touch
test with leg length. With reach-to-leg ratio held constant, there was only
a .01 increase in the correlation of toe-touch with flexibility, and no change
at all with leg length or the weight-height ratio held constant. These results
indicated that, if all participants in the test had the same degree of flexibility,
reach-to-leg ratio and leg length would be slightly greater factors in ability
to perform the toe-touch test than was indicated by the zero order correlations.
It was also interesting to note that the correlation between the toe-touch test 
and flexibility did not change materially when any of the other three variables 
was held constant. 

Difference in Toe-Touch Test Scores of Various Body Builds. Examination
of the distribution of reach-to-leg ratios indicated that the high and low
groups were not widely different (see Table 5). One of the eight students
with a ratio of 1.28 and three of those with a ratio of 1.21 were drawn at
random and placed in the middle group. Since the difference between the
lowest ratio of the high group and the highest ratio of the low group was
only .07, and the majority of the high group (two-thirds) had ratios below
1.32 and all but three of the low group had ratios above 1.19, it would seem
that this group did not contain many young women of very extreme body
builds.


TABLE 5
Frequency Distribution of Reach-to-Leg Ratios


When these high and low groups were compared, it was found that those 
with the longer reach in relation to leg length had a higher mean toe-touch 
test score than those with a shorter reach in relation to leg length and this 
difference was significant at the 3 per cent level of confidence (see Table 
6). The Chi Square results indicated that neither of these toe-touch distribu- 
tions differed significantly from a normal distribution. 


TABLE 6

Significance of the Difference Between High and Low Reach-to-Leg Ratio Groups

                              Mean             Sigma     Sigma                    Level
Group           No.   Mean    Diff.   Sigma    of Mean   of Diff.      t          of Signif.

Toe-touch test
  High Group    30    21.67            6.57     1.20
  Low Group     30    17.83   3.84     7.14     1.30    [illegible]  [illegible]  3%
Flexibility
  High Group    30   144.57           13.22     2.41
  Low Group     30   142.73   1.84    12.20     2.23     3.23        [illegible]


In actual inches, the mean for the high group was almost one inch below
the level of the toes and the mean for the low group was approximately one
inch above the level of the toes, or a mean difference of approximately two
inches. It must be noted that two inches is represented by four points on the
scale and that toe level equalled a score of 20. The range of scores for the
high group was 11½ inches, from minus 4½ inches (above the toes) to
plus 7 inches (below the toes). The range of scores for the low group was
16 inches, from minus 9 inches (above the toes) to plus 7 inches (below the
toes). It was interesting to note that the one low-group student who reached
7 inches below her toes also had the largest degree of flexibility (168°).
This student had also had several years of dance instruction. The next lowest
downward reach that any of the students in the low group attained was three
and one-half inches below the toes, and again this was reached by only one
student. Of the high group, there were seven students (approximately
one-fourth) who reached three and one-half inches or more below their toes.

There was no significant difference in the mean flexibility of these two 
groups. Again the Chi Square results indicated that the distributions were 
not significantly different from a normal distribution. 

The range of the flexibility scores was approximately the same for both, 
112° to 170° for the high group and 115° to 168° for the low group. The 
difference between the means was less than two degrees. 

These results indicated that those persons in the high group, those with a 
longer reach in relation to their shorter legs, definitely had an advantage in 
the performance of the toe-touch test over those students with a shorter reach 
in relation to leg length. The relationship of trunk-plus-arm length to leg 
length was a significant factor in the ability to perform the toe-touch test 
for those students with the “extreme” body types. In all probability it would 
be a greater factor if more extreme body types were measured. This difference 
between the groups was not due to a difference in flexibility, since no signifi-
cant difference was found between the high and low groups in the flexibility
scores.

The study of the “extremes” in weight-height index again indicated that 
this group of students did not contain truly extreme body types. The total 
range of the weight-height indexes was only .90 and the lowest index of the 
“high” group was only .16 higher than the highest of the “low” group (2.05 
and 1.89). 

TABLE 7

Significance of the Difference Between High and Low Weight-Height Ratio Groups

                              Mean                 Sigma         Sigma
Group           No.   Mean    Diff.   Sigma        of Mean       of Diff.      t

Toe-touch test
  High Group    30    22.2             7.04         1.29
  Low Group     30    19.7    2.5    [illegible]  [illegible]  [illegible]  [illegible]
Flexibility
  High Group    30   147.8            12.86         2.35
  Low Group     30   142.6    5.2     10.99         2.01         3.09         1.68

The “high” weight-height group had mean scores in both the toe-touch 
test and actual flexibility measurements which were higher than the “low” 
group but neither difference was significant (see Table 7). Although the 
significant but very low correlation coefficient indicated that there is some 
relationship between the weight-height ratio and ability to perform the toe- 
touch test, no conclusion can be drawn as to whether girls who are extremely 
heavy for their height would be handicapped to any extent in the perform- 
ance of the toe-touch test or in flexibility as measured by the flexometer. 
This study indicates that girls within this weight range were not handicapped; 
if anything, they did better. 


Conclusions 
Within the limitations of this study, the following conclusions seem justified.

1. The relationship of trunk-plus-arm length to leg length is not an 
important factor in the performance of the toe-touch test for those persons 
with average body builds. Therefore, the toe-touch test could be used as an 
indication of hip and back flexibility for the average body build. 

2. For those persons with extreme body types, the relationship of trunk- 
plus-arm length to leg length is a significant factor in the performance of 
the toe-touch test. Those persons with a longer trunk-plus-arm measurement 
and relatively short legs have an advantage in the performance of this test. 
Those persons with long legs and a relatively short trunk-plus-arm measure- 
ment are at a disadvantage in the performance of the toe-touch test. 


Recommendations for Further Study 


1. A study of a group which is known to include more extreme body types
is needed. 

2. Since adolescent boys and young men are more likely to have long 
legs in relation to their trunks, a similar study of boys and/or men would 
be interesting. 

3. A study of the effect of specific practice of hip, leg, and lower back 
stretch exercises on the ability of the various body types to perform this 
test would be valuable.


Movement and Meaning:
Development of a General Theory

Abstract


A tentative general theory of the meaning of human movement-kinesthesia as a somatic- 
sensory experience which can be conceptualized by the human mind was developed 
within the context of the basic assumptions of the philosophy of symbolic transformation 
as they relate to the nature of the process which enables human beings to find meaning 
in their sensory perceptions. The essential elements common to all forms of human move- 
ment were identified. A vocabulary was developed to refer to these elements in their most 
general form. Using this vocabulary, the relationships among these elements were 
analyzed in relation to the process of human thought. From this analysis, a tentative 
general theory of the meaning inherent in human movement-kinesthesia was formulated. 
The intent to attempt to validate this theory in subsequent papers is stated. 


MOVEMENT is an essential element in all animal life. The feeling of mov-
ing in space accompanies and guides every reaction an animal makes to the
stimuli that constitute its sensory perception of its internal and external
environment. This is equally true for man, but there is a significant dif-
ference between animal and human movement, just as


There is an unmistakable difference between organic [animal] reactions and human 
responses. In the first case a direct and immediate answer is given to an outward [or 
inward] stimulus; in the second case the answer is delayed. It is interrupted and 
retarded by a slow and complicated process of thought. (1, p. 43) 


Human movement differs from animal movement because man is able to 
think about his own movement. He can conceptualize his kinesthetic percep-
tion of his own movements. And he can try to “make sense” out of these 
conceptualizations by philosophizing about them within the context of his 
own structure of human meanings and values. 

The somatic or structural aspects of movement have been studied by many 
investigators; but the significance of the human ability to conceptualize the 
sensory or perceptual aspects of movement has received little attention; and 
the questions relating to human meanings and values in this conceptualization 
of the structural-perceptual experience of movement-kinesthesia have scarcely 
been raised. The establishment of a coherent theory about the meanings in- 
herent in the psycho-sensori-somatic experience of moving as a human being 
could do much to clarify the significance of physical education as the form 
of education which is primarily concerned with human movement experiences. 




Statement of the Problem 

The central problem of this study was the development of a tentative 
general theory about the meaning of human movement-kinesthesia as a 
somatic-sensory experience which can be conceptualized by the human mind. 
The logical solution of this problem was considered to be the first step in an 
attempt to develop a coherent philosophy of movement which identifies the 
meaning and value inherent in the basic human experience of moving and 
perceiving movement. 


Procedure 

The philosophical process of developing a tentative general theory which
rested on identifiable basic assumptions and incorporated all observable
aspects of the movement experience was implemented as follows: 1. A widely
accepted contemporary theory about the nature of the mental process which
enables human beings to find meaning in their sensory perceptions was
examined in terms of its relevance to the understanding of movement-kinesthesia;
2. Within the context of this theory, the elements which appear to be involved
in all forms of human movement were identified; 3. To enable investigators
to refer to these elements in their most general forms, a vocabulary was
developed; 4. Using this vocabulary, the relationships among the general
elements identified in human movement-kinesthesia were analyzed in relation
to the process through which the human mind finds meaning in human
experiences; and 5. Out of this analysis and synthesis, a tentative general
theory of the meaning inherent in human movement-kinesthesia was formulated.


The Process of Human Thought 

For many years, it was customary to describe the brain as a “transmitter 
system” similar to a telephone exchange in which sensory messages were 
received and motor messages sent out. This analogy provided a convenient 
explanation for reflex behavior. It was expanded when Pavlov and others 
demonstrated that reflexes could be conditioned, and a given response could 
be elicited by a substitute stimulus which became a “sign” for the original 
stimulus. The theory of conditioned reflexes accounted for many facets of 
animal behavior, but it provided no adequate explanation for man’s ability 
to comprehend his own stimulus-response experiences and think about them 
in abstractions or ideas. Consideration of this unique ability “which appears 
to be the distinctive mark of human life” (1, p. 42) led to the conclusion 
that evolutionary development had produced a human mind which was sig- 
nificantly different from the animal brain. 


The functional circle of man is not only quantitatively enlarged; it has also under- 
gone a qualitative change. Man has, as it were, discovered a new method of adapting 
himself to his environment. . . . As compared with other animals, man lives not merely
in a broader reality; he lives, so to speak, in a new dimension of reality. (1, pp. 42-43)


This concept of “a new dimension” in the human mind supplied the clue
to a new analogy which incorporated man’s unique ability to transform
sensory perceptions into abstractions formulated as ideas. It is now
recognized that while the animal brain acts essentially as a “transmitter,” the
human mind is better likened to a “transformer” in which sensory perceptions 
undergo a fundamental change of character during the process of stimulus 
transmission. The philosophy of symbolic transformation, developed initially 
by Ernst Cassirer (1) and later by Susanne K. Langer (4) and others, 
incorporates this concept of the transforming power of the human mind. 


Ideas are undoubtedly made out of impressions—out of sense messages from the 
special organs of perception, and vague visceral reports of feeling. . . . The material 
furnished by our senses is constantly wrought into symbols, which are our elementary 
ideas. . . . For the brain is not merely a great transmitter, a super-switchboard; it is 
better likened to a great transformer. The current of experience which passes through 
it undergoes a change of character, not through the agency of the sense by which the 
perception entered, but by virtue of a primary use which is made of it immediately: 
it is sucked into the stream of symbols which constitutes a human mind. . . . It is 
only when we penetrate into the varieties of symbolific activity . . . that we begin 
to see why human beings do not act as superintelligent cats, dogs, or apes would act. 
(4, pp. 33-34) 


The human capacity for transforming sensory perceptions into symbols was 
emphasized by Ittelson and Cantril in their recent review of studies related 
to the nature of perception and sensation. 


In man the receiving of symbolic messages is undoubtedly one of the most important 
functions of perception. In lower animals it can be observed, if at all, only in a most 
primitive and stereotyped way. . . . In studying human perception, we have con- 
stantly to bear in mind that it is impossible to have any perception which is devoid of 
symbolic content. Furthermore, this symbolic content is not some excess baggage added 
to the perception, but is an integral and inseparable part of it. (2, pp. 19-20) 


This capacity for dealing with abstractions which symbolize or represent 
his sensory perceptions is the basis of man’s ability to find meaning in his 
life as a human being. 


. . . That symbolic thought and symbolic behavior are among the most characteristic 
features of human life, and that the whole progress of human culture is based on 
these conditions, is undeniable. . . . Hence, instead of defining man as an animal 
rationale, we should define him as an animal symbolicum. By so doing we can desig- 
nate his specific difference, and we can understand the new way open to man—the 
way to civilization. (1, pp. 44-45) 

Speech, a uniquely human development which is shared by no other branch 
of the animal kingdom, is prime evidence of man’s capacity for symbolic 
transformation of sensory perception (1, pp. 142-175; 4, pp. 83-115). Sounds 
produced by vibrations of the vocal cords are transformed into words, which 
are symbols for concepts or meanings. Using these convenient abstractions 
of experience, the human mind is able to grasp, retain, and express ideas 
in a logical structure of discourse called language. 


The main lines of logical structure in all meaning-relations are . . . : the correlation 
of signs with their meanings by a selective mental process; the correlation of symbols 
with concepts and concepts with things, . . . ; and the assignment of elaborately 
patterned symbols to certain analogues in experience, the basis of all interpretation 
and thought. These are essentially the relationships we use in weaving the intricate 
web of meaning which is the real fabric of human life. (4, p. 63) 

In recent years the implications of the philosophy of symbolic transformation 
for some of the “wordless” or non-discursive areas of human comprehension 
and expression have been tentatively examined, with specific reference 
to music, the graphic arts, and dance (3, 5). Essentially, it now appears 
evident that animal symbolicum transforms sensa into many different kinds 
of conceptual symbols, not all of which can be translated into words. A 
musical composition, for example, is a symbolic formulation of concepts in 
non-verbal sounds. A picture can never be completely described in words; 
it conveys its meaning in non-discursive visual symbols. Similarly, dance 
is recognized as a non-discursive art form which symbolifies concepts in 
movement. 


The recognition of presentational [non-discursive] symbolism as a normal and prevalent 
vehicle of meaning widens our conception of rationality far beyond the traditional 
boundaries, yet never breaks faith with logic in the strictest sense. Wherever a symbol 
operates, there is a meaning: . . . No symbol is exempt from the office of logical 
formulation, of conceptualizing what it conveys; however simple its import, or however 
great, this import is a meaning, and therefore an element for understanding. Such 
reflection invites one to tackle anew, and with entirely different expectations, the whole 
problem of the limits of reason, the much-disputed life of feeling, and the great con- 
troversial topics of fact and truth, knowledge and wisdom, science and art. (4, pp. 
78-9) 

The symbolic transformation of movement-kinesthesia has been explicitly 
recognized in dance as an art form, even as the symbolic nature of language 
is clearly identified in the language art form called poetry. But the symbolism 
of poetic language is only an extension of the first crude transformation of 
sounds into conceptual symbols which distinguish “the ape with the lalling 
instinct” from all other animals. So it would seem that recognition of the 
symbolic import of movement in the dancer’s art would imply a fundamental 
human capacity to transform movement-kinesthesia into meaningful non- 
discursive conceptual symbols. This line of reasoning suggests that the phi- 
losophy of symbolic transformation may provide the key to whatever mean- 
ing is inherent in movement-kinesthesia as man’s most persistent sensory 
experience. 


The Problem of Vocabulary 


The current terminology of movement-kinesthesia is characterized by great 
diversity. In general, it has been developed by specialists concerned with 
specific aspects of movement. Anatomists, physiologists, neurologists, and 
orthopedists have studied structure and function; kinesiologists have identified 
principles of mechanics and dynamics; and those interested primarily in 
sports, dance, work, or therapeutics have all developed special terminologies. 
This concern with specifics has created a kinesiological Tower of Babel 
inhabited by specialists speaking in different tongues, unable to communicate 
adequately with each other about the general nature of the phenomena of 
movement and kinesthesia with which they are all dealing. Since “the whole 
purpose of general concepts is to make the distinction between special cases 
clear” (4, p. 43), any attempt to explore the general nature of movement- 
kinesthesia as a basic human experience with a vocabulary so contaminated 
by specific connotations can only add to the confusion. 

A general vocabulary which identifies the elements common to all forms 
of movement is prerequisite to the development of a general theory of the 
meaning of movement-kinesthesia as a basic human experience. Such a vo- 
cabulary is presented below. It has been developed by logical analysis of 
the elements which are inherent in all forms of movement. 

Considered in their most generalized form, the elements which appear to 
be common to all human movement may be classified as: 

Structural—A dynamic somatic pattern is constructed by the changing positional 
relationships of the body masses. 

Perceptual—The dynamic structure of this somatic pattern is perceived by the 
kinesthetic sensorium. 

Conceptual—This dynamic somatic pattern has some significance as a response made 
by a human being to his sensory perception of external and/or internal 
environmental stimuli. 

Words referring to these three general aspects of human movement have been 
developed by combining the root of the Greek word kinein (to move) with 
general word forms which identify the concepts of structure, sensory percep- 
tion, and conceptualization. In the definitions which follow, the word “form” 
is used in its most general meaning as referring to a formulation of character- 
istics and the relationships among them which give an event its unique 
identity. 

Kinestruct: n. A dynamic somatic form constructed by body masses in motion. 
Kinestructure: v. To create a kinestruct. 

Kinescept: n. A sensory form created by kinesthetic perception of a kinestruct. 
Kinesceptualize: v. To perceive a kinescept. 

Kinesymbol: n. A conceptual form which is an abstraction of the significance or import 
of a kinestruct and its kinescept within the socio-psycho-somatic context 
of a situation. 
Kinesymbolize: v. To conceptualize the import of a kinestruct-kinescept. 


These words are used in the discussion which follows. 


The Nature of the Kinestruct 


A kinestruct can never be described in detail because it is a dynamic form 
compounded out of continuous changes in tension in every muscle fiber of 
the body. (For example, “He raised his arm” describes only the gross nature 
of one aspect of a kinestruct. How did he raise his arm? What positions 
were assumed by the rest of his body? How were these positions altered by 
the raising of his arm? What changes occurred in the tension of the muscle 
fibers in his back? In his legs? In his neck? In what ways was the raising 
of his arm related to the total situation in which it occurred?) These uncount- 
able changes in muscle tension which are synchronized into “a movement” 
are never random or spontaneous. They occur as a total response to the 
total situation in which the movement occurs. 


The Nature of the Kinescept 

Neither can the kinescept which corresponds to the kinestruct be described 
in detail. It is felt as a composite of the sensa transmitted by the entire 
kinesthetic sensorium as well as the “vague visceral reports of feeling” arising 
from continuous changes in homeostasis as the kinestructural response to the 
total situation is formulated. 

The interaction among the muscular, neural, and biochemical aspects of 
movement is continuous, operating as a complex “feedback” system. The 
countless minute changes in the kinescept as the movement progresses are 
translated into motor-nerve stimuli which produce changes in muscle tension; 
these changes, in turn, provide new sensory information which modifies the 
kinescept and is utilized in turn to modify the motor-nerve stimuli, thus co- 
ordinating the kinestruct as it progresses in relation to the stimulus situation 
which initiated it. Through this interaction of situation-kinescept-kinestruct, 
the co-ordinated response of the person to his personal interpretation of the 
stimulus situation is kinestructuralized. 

The kinescept provides a sensory record of the kinestructural response of 
the person to the situation even while it is controlling or guiding that re- 
sponse. This sensory record may or may not be consciously identified by the 
person, but it is always present. (The effect of obliterating one part of it can 
be illustrated by the difficulty of walking when the “foot has gone to sleep” 
and the local proprioceptors are deadened.) At times, the kinescept may be 
conceptualized at the cortical level of awareness, as it is when the person tries 
to “get the feel” of a movement pattern, or when he tries to recall “the feel 
of a movement” which he has previously performed. At other times, especially 
when the kinestruct is a familiar one, the kinesthetic feedback may 
operate primarily at the cerebellar level, continuing to guide the kinestruct 
while the mover’s conscious attention is focused on something else. In reflex 
movement, the feedback may operate primarily at the spinal level. But at 
whatever level neuromuscular interaction is effected, the kinescept of the 
kinestruct is always present, because without it co-ordinated movement is 
impossible (Cf. 6, pp. 151-190). 

This sensory perception of the “feel of a movement” can never be satisfac- 
torily described in words. Just as a sound must be heard, as a color must 
be seen, so a kinescept must be felt to be identified. It can be comprehended 
only in its own unique identity as a kinesthetically perceived and therefore 
non-discursive perceptual form. 

The Nature of the Kinesymbol 

A kinescept can be conceptualized as a unique perceptual “form” which 
conveys unique sensory information about one aspect of a person’s relationship 
to the world. Since the human mind conceptualizes perceptions by transforming 
them into abstractions which serve as symbols of the meaning that 
the perceptions had to the person, it follows that kinescepts must also be 
subjected to this transforming process, becoming abstractions which are 
symbols of the meaning of the movement as perceived. 

This conceptualization of kinesthetic perception cannot be expressed in 
the symbols of any other sensory conceptualizations; it is not verbal, visual, 
auditory, or anything else but what it is—a conceptualization of kinesthetic 
perception. It is a kinesymbol, an abstraction of a kinesthetic experience 
which contains its own human meaning in its own kinesthetically perceived 
form. This meaning may not be consciously recognized; it may be vague, 
fragmentary, or transient; it may be definite, organized, and long-lasting. It 
may be as functional as the meaning of locomotion or as non-utilitarian as 
“standing on your head to see if you can.” But every kinestruct and its 
kinescept is a kinesymbolic formulation of personal experience which adds 
one more trace of meaning to a human life. 

The characteristics which make kinescepts and kinestructs peculiarly well 
adapted to symbolic transformation are illuminated by Langer’s discussion 
of the suitability of vocables for utilization as symbols. 


. . . The little vocal noises out of which we make our words are extremely easy to 
produce in all sorts of subtle variations, and easy to perceive and distinguish. . . . 
Not only does speech cost little effort, but above all it requires no instrument save the 
vocal apparatus and the auditory organs which, normally, we all carry about as part 
of our very selves; so words are naturally available symbols as well as very economical 
ones. . . . Vocables in themselves are so worthless that we cease to be aware of their 
physical presence at all, and become conscious only of their connotations, denotations, 
or other meanings. Our conceptual activity seems to flow through them, rather than 
merely to accompany them. . . . They fail to impress us as “experiences” in their own 
right unless we have difficulty in using them as words as we do with a foreign 
language or a technical jargon until we have mastered it. But the greatest virtue of 
verbal symbols is, probably, their tremendous readiness to enter into combinations. 
There is practically no limit to the selections and arrangements we can make of them. 
. . . Herein lies the power of language to embody concepts not only of things, but of 
things in combination, or situations. (4, pp. 61-62) 


This paragraph might be paraphrased into a description of kinestructs and 
kinescepts somewhat as follows: 


The little changes in muscle tension out of which we make our kinestructs are 
extremely easy to produce in all sorts of subtle variations, and easy to perceive and 
distinguish. They require no instrument save the physical apparatus which is a part 
of ourselves; so kinestructs and kinescepts are naturally available as symbols as well 
as very economical ones. These kinestructs and kinescepts are so much a part of our 
lives that we cease to be aware of their physical presence, and become conscious only 
of their connotations. Our human activities seem to flow through them rather than to 
be identified with them. They fail to impress us as “experiences” in their own right 
unless we have difficulty in comprehending or performing a new kinestruct, as we do 
with a new “skill” until we have mastered it. But the greatest virtue of kinestructural- 
kinesceptual symbols is their tremendous readiness to enter into combinations. There 
is practically no limit to the selections and arrangements we can make of them. Herein 
lies the power of movement-kinesthesia to embody concepts not only of things, but of 
things in combinations, or situations. 


As a person moves in many different situations, a given kinescept may 
acquire many encrustations of meaning derived from the intellectual-emo- 
tional responses associated with those situations, in time becoming a very 
complex symbol which can “stand for” the total meaning or import of the 
situations in which it was experienced. For example, the “feel” of a golf 
swing is experienced within the context of the total meaning of the game 
situation in which the club is swung. This context includes the mover’s sub- 
conscious and non-conscious sensori-emotional reactions to the meaning the 
situation has for him as well as his conscious intellection about it. Like other 
identifiable non-discursive perceptual forms, such as sight or sound, the 
kinescept is an integral part of the mover’s total sensory record of the situa- 
tion. Accordingly, if it is abstracted from the situational context, it can 
represent or “stand for” that context in the same way that a picture or a song 
can “stand for” a total experience and “bring back” its complex sensori- 
emotional-intellectual connotations at some later date. In short, a kinescept 
may serve not only as a discrete kinesymbol of the movement experience, 
as such, but also as a kinesymbol of the total import of a situation in which 
it has been experienced. Since all kinescepts are perceived within the context 
of a total situation, it seems probable that the meaning of all kinesymbols 
tends to become very complex. 

The kinescepts of similar kinestructs may thus have very different emotional- 
intellectual import as kinesymbols for different people, depending upon 
the meanings, both obvious and subtle, in the situations in which they were 
experienced. (For example, the kinesymbolic import of the kinescept of 
“acute flexion of the thigh on the hip joint” may be quite different for a 
football player and a ballet dancer because the meanings and connotations 
in the situations in which their high kicks have been executed are not 
analogous.) In time, a kinescept may accumulate many residual connotations, 
some of which may be mutually contradictory, and the performance of a 
given kinestruct may elicit conflicting emotions. 

It may be noted that habitual postural kinestructs have long been recog- 
nized as kinesymbolic expressions of personality which reflect the influence 
of subconscious drives, motivations, and interpretations of self. It seems 
probable that the kinesymbolic meaning of all habitual kinestructs involves 
similar emotional components derived from subconscious associations with 
other elements in the person’s life experiences. 


A Tentative General Theory 


The distinguishing characteristic of human mentality has been identified 
as the ability to transform sensory perceptions into abstractions or symbols. 
It has been shown that the sensory perception of movement, called kinesthesia, 
is susceptible to such symbolic transformation. Kinesthesia may thus be 
identified as a component of human mentality. 

The experience of moving as a human being has been analyzed, and 
three distinct but interrelated forms have been identified: 1. A structural 
form called a kinestruct; 2. A perceptual form called a kinescept; 3. A con- 
ceptual form called a kinesymbol. The nature of these forms and their inter- 
relationships provide the basis for a tentative general theory of the meaning 
of human movement-kinesthesia: 

A kinestruct is the non-discursive kinesymbolic 
expression of the import of its kinescept. 


Discussion 


The validity of this general theory must now be tested by determining to 
what extent it seems to account for observable manifestations of the phe- 
nomena to which it refers. Obviously, such an extended process is beyond 
the scope of this preliminary paper. A few illustrations, however, may sug- 
gest the approach. 

How does a person comprehend or kinesceptualize a kinestruct created 
by another person? How does he create a similar kinestruct out of this 
first kinesception? What new dimension is added to the problem of “motor 
learning” by consideration of the kinesymbolism of the kinescept-kinestruct? 
Do the differences in kinesymbolism of “acute flexion of the thigh on the 
hip joint” provide an explanation for the mutual difficulty the athlete and 
the dancer may experience in attempting to perform each other’s version of 
a similar kinestruct-kinescept? Exploration of the kinesymbolic import of 
habitual postural kinescepts may suggest ways to lessen the difficulties usually 
encountered in trying to “teach posture,” i.e., establish a new postural kine- 
struct-kinescept. 

Perhaps the perennially bothersome question of defining “quality of 
movement” may be tentatively answered by recognizing that the kinestruct 
and its kinescept are both kinesymbols, and that subtle differences in “qual- 
ity” of similar kinestructs represent the subtle differences in their import as 
kinesymbols for the two performers. 

These few examples suggest approaches to many problems related to the 
kinesymbolic kinescepts-kinestructs incorporated in the physical education 
program, but these must await further investigation. It is our belief, however, 
that the philosophy of movement which may be developed out of investigation 
of such problems can give new significance to “physical education” by 
illuminating the meanings and values inherent in movement-kinesthesia as 
one facet of man’s ability to understand himself and the world in which he 
lives. 


Abstract 


It was the purpose of this study to compare performances of fourth grade children 
on the Kraus-Weber test to those on the California Physical Performance Test. The 
results show that children who fail one strength item or any two or more items on the 
K-W test make lower scores on the average in running, jumping, throwing and sit-ups 
than do those who pass all K-W items. This difference is significant in all events for 
boys but only in the throw for girls. 

In addition, race and sex differences in K-W tests were investigated. 


THE KRAUS-WEBER test of minimum muscular fitness (8, 9) has recently 
received widespread attention. This six-item battery was designed for clinical 
examination of patients suffering low back pain. Kraus (9) states that 80 
per cent of the 4000 cases studied were found free from organic disease and 
that this group failed to pass one or more of the six items. Furthermore, 
permanency of relief was observed to run parallel with the muscular status. 
He goes on to report that “patients whose physical fitness level fell below 
these minimum requirements appeared to be ‘sick’ people, individuals who 
bore all the earmarks of constant strain and who frequently manifested signs 
of emotional instability” (9, p. 182). No figures are reported. 

The K-W test items were found to be highly reliable by Phillips et al. (12) 
when the first administration of the test was compared with a second 5 to 15 
minutes later. The effect of slight warm-up on the flexibility measure has 
been shown to be considerable, however (3). The diagnostic value of an 
item which can be passed minutes after a recorded failure is open to question. 

Just what is actually measured by the K-W test remains unsettled. Its 
validation as a “minimum muscular fitness” test for adults has not been 
supported by objective evidence. The poorer performance of United States 
children in comparison with European children has been widely publicized. 
Recent investigators in the United States (5, 7, 12) have obtained results 
similar to, although somewhat better than, those reported by Kraus. No addi- 
tional studies on Europeans have been published. An Iowa study (5) at- 
tempted to relate K-W performance to emotional instability in children but 
found no significant relationship. An Indiana study (12) found no reliable 
difference in grip strength in elementary school children who passed and in 
those who failed the K-W test. 

A great deal of research has been done over a period of years on physical 
abilities related to success in activities of the physical education program (2, 
11). Since the acquisition of skills and participation in active games is rec- 
ognized as an important developmental task (6) of later childhood, tests of 
gross motor abilities are measures of this aspect of fitness. 

The only batteries to measure motor ability validated against actual performances 
of elementary school boys and girls are those of McCloy (10). 
He has shown that a small number of track and field events plus a strength 
test predict fairly well achievement in a large number of tests. The Cali- 
fornia Physical Performance Tests of the California Project on Fitness (4) 
are almost identical with those of the McCloy general motor achievement 
battery. Events recommended for elementary school children include the 50- 
yard dash, standing broad jump, softball throw for distance, sit-ups, and 
modified pull-ups or push-ups. To date, no attempt has been made to 
combine scores. 


Purpose 


It was the purpose of this study to compare performances of fourth grade 
children on the K-W test to those on the California Physical Performance 
Test. In addition, race and sex differences on the K-W test were investigated. 


Procedure and Results 


Subjects for the study were children enrolled in seven of the 14 elementary 
schools of Berkeley, California. Classes tested included some in which the 
enrollment was predominantly white, some predominantly Negro, and the 
remainder fairly evenly mixed. Although schools selected were located in 
various parts of town, the population studied is not typical of the city as a 
whole. 


The K-W tests were given by the author assisted by a graduate student. 
Results obtained in this study on each item and for the test as a whole are 
shown in Table 1. Several investigators have reported results by age and sex 
and these have been included in the table for comparative purposes. Although 
the limits used for age in these studies have not been specified, it is assumed 
that a nine-year-old child, for example, is one who has passed the ninth birth- 
day but has not yet reached the 10th. Grouped in this way, there were 
29 8-year-olds, 228 9-year-olds, and 37 10-year-olds in the present study. 
Chi-square computed between those under 9.5 years and those over who 
passed the K-W test was not significant for either boys or girls, so all results 
are included in Table 1. 
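
The age comparison described above rests on a chi-square test of a 2 x 2 table (age group by pass or fail). A minimal sketch of that computation follows. The counts are hypothetical, since the paper reports only that the result was not significant; the function name `chi_square_2x2` and the omission of a continuity correction are also assumptions made for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of chi-square at the 5 per cent level with 1 degree of freedom.
CRITICAL_5_PERCENT = 3.841

# Hypothetical counts (NOT from the study): rows are age groups, columns are pass / fail.
under_9_5 = (45, 35)
over_9_5 = (47, 34)

statistic = chi_square_2x2(*under_9_5, *over_9_5)
age_difference_significant = statistic > CRITICAL_5_PERCENT
print(round(statistic, 3), age_difference_significant)
```

When the statistic falls below the critical value, as it does for these illustrative counts, the two age groups can reasonably be pooled, which is the decision the text describes.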

The California Physical Performance Tests were given by the classroom 
teachers, often with the assistance of the City Supervisor of Physical Education.¹ 
No scores are reported for modified push-ups, because so few records 
were available. Results for both boys and girls were placed in three groups 
according to their K-W test scores. The first group was made up of those 
who passed all K-W items; the second, of those who failed in flexibility only; 
and the third, of those who failed either one strength item or two or more 
items. These results are presented in Table 2. 


1 The author is most appreciative of the co-operation of the Berkeley Public Schools 
and especially of Mr. Stanley Friese, supervisor of physical education. 

TABLE 1 

Kraus-Weber Test Results from Various Studies: 
Percentage Failure of Nine-Year-Olds 

                      Eastern U.S. (9)    Oregon (7)      Indiana (12)     This Study
K-W Items¹            Boys and Girls      Boys   Girls    Boys   Girls     Boys   Girls

No. of cases          See footnote 2      100    100      88     103       161    133

1                                                         4.5    3.8       .6     3.7
2                                                         12.5   13.6      8.1    10.5
3                                                         2.3    1.9       2.5    1.5
4                                                         0      0         0      0
5                                                         0      1.0       0      0
6                                                         51.1   23.3             12

Failure (1 or more)   54.0                40.0   34.0     55.7   28.2             24.9

¹ Items 1-3 designed to measure abdominal and/or psoas strength. 
  Items 4, 5 designed to measure back strength. 
  Item 6 designed to measure flexibility. 
² 4264 boys and girls 6-16 inclusive. Number of 9-year-olds unstated. 


TABLE 2 

Results on California Physical Performance Tests According to 
Kraus-Weber Test Records 

Standing Broad Jump (Feet)          50 yd. Dash (Seconds) 


Mean | Sigma | Mean | Sigma Mean | Sigma 


RE EE | = 8l 13.2 8.2 mf 
Fail: Flexibility | 4.1 9 77 8.2 8.7 6 
Fail: One strength 
or multiple 3.7 7 63 6.7 9.6 | 5.3 
City median Class A*} 4.7 73 8.5 12 
Girls 
Rien rane Was | d 45 : 8.9 11 9.4 
Fail: Flexibility _.| 3.9 J 46 . 8.9 9 7.3 


Fail: One strength 
or multiple | 4.1 36 . 9.4 8 38 


City median ClassA*| 4.3 36 88 | 10 

* Children are classified by age, height, and weight (1). 


Differences for boys between those who pass the K-W and those who fail 
one strength or multiple items are significant at the 1 per cent level in all 
events. Differences between boys passing the K-W and failing flexibility only 
are significant in the dash (5% level) and the sit-up (1% level). Among 
girls, those who pass the K-W are significantly superior to those who fail 
one strength item or multiple items in the throw for distance only. 
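
The paper does not state which test of significance was applied to these group differences. One conventional choice is the critical ratio: the difference between two independent group means divided by the standard error of that difference. The sketch below illustrates it under that assumption. The means, sigmas, and group sizes are hypothetical and are not taken from Table 2, and the function names are introduced here only for illustration.

```python
import math


def critical_ratio(mean1, sigma1, n1, mean2, sigma2, n2):
    """Difference between two independent group means divided by its standard error."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def significance_level(ratio):
    """Label a two-tailed result using normal-curve critical values."""
    magnitude = abs(ratio)
    if magnitude >= 2.576:
        return "1 per cent"
    if magnitude >= 1.960:
        return "5 per cent"
    return "not significant"


# Hypothetical standing broad jump figures in feet (NOT taken from Table 2).
pass_group = dict(mean1=4.5, sigma1=0.6, n1=100)
fail_group = dict(mean2=3.7, sigma2=0.7, n2=30)

ratio = critical_ratio(**pass_group, **fail_group)
print(round(ratio, 2), significance_level(ratio))
```

With small groups, a t distribution would be the more cautious reference than the normal-curve values used here.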

TABLE 3 

Percentage of Fourth Grade Pupils Passing or Failing the K-W Test 
According to Race and Sex 

                                                          % Fail strength
Fourth Grade                              % Fail          or
Pupils             No.      % Pass        flexibility     multiple failure

Boys
  White                     52            36              11
  Negro                     68*           24              8
  Oriental                  71            28              0

Girls
  White            73       71            14              15
  Negro            49       81            12              6
  Oriental         11       82            0               18

* Significantly greater than percentage of white boys passing. No other differences are significant. 


The percentage of Negro boys passing all items is significantly greater 
(5% level) than that of white boys. The same is true (1% level) for both 
Negro and white girls in comparison with white boys. If the flexibility item 
is omitted, there are no significant race or sex differences in performance 
(Table 3). 
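
A comparison of two pass percentages of this kind can be sketched as a test of the difference between two independent proportions. The paper does not name the test it used, so the pooled-standard-error z statistic below is an assumption. The figures assume that the girls' rows of Table 3 read as 73 white girls with 71 per cent passing and 49 Negro girls with 81 per cent passing; the function name is illustrative only.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions (pooled standard error)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Assumed reading of the girls' rows of Table 3: (proportion passing, number of cases).
negro_girls = (0.81, 49)
white_girls = (0.71, 73)

z = two_proportion_z(*negro_girls, *white_girls)
significant_at_5_percent = abs(z) >= 1.960
print(round(z, 2), significant_at_5_percent)
```

Under this reading the statistic falls short of the 5 per cent critical value, which agrees with the table footnote that no differences other than those with white boys are significant.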


Discussion 


The results show a positive relationship between K-W scores and quality 
of performances of boys in running, jumping, throwing, and sit-ups. Strength 
and to some extent flexibility are factors common to both types of tests. 
Thus, the K-W test has some validity as an indicator of motor abilities in 
elementary school boys. There is little relationship shown in the data for 
girls, however. 

The California Physical Performance Test has a number of advantages 
over the K-W as a measure of fitness of elementary school children. It pro- 
vides a direct measure of the natural activities of children—running, jump- 
ing, throwing, and climbing. Scoring is on a continuum basis rather than 
pass or fail, permitting evaluation of performance in relation to the group 
or norm as well as measurement of progress from time to time. The events 
are challenging and the results meaningful to every child. Group testing is 
practicable. 

These findings suggest that flexibility as measured by bending forward and 
touching fingertips to the floor, keeping knees straight, is important in various 
activities. 


Conclusions 


1. Children who fail one strength or multiple items on the K-W test 
make lower scores on the average in running, jumping, throwing, and sit-ups 
than do those who pass all K-W items. The difference is significant in all 
events for boys but only in the throw for girls.

2. Boys who fail the K-W flexibility item make significantly lower scores
on dash and sit-up than do boys who pass all items. 

3. Significantly more Negro boys and both Negro and white girls pass 
the K-W test than do white boys. 

4. If the flexibility item in the K-W test is omitted, no significant sex or 
race differences in performance are found.


Abstract


This article seeks to clarify the concept of test reliability and to differentiate between 
the theoretical definition and the methods of estimation. In the first part, the split-halves 
and test-retest methods are critically examined in the context of a typical skills test. The 
broader and more comprehensive definition of error under the test-retest method is noted. 
In the second part, the use of analysis of variance techniques in reliability studies is
illustrated by application to a badminton wall volley test. The advantages of this approach
over traditional approaches are discussed and possible applications cited.


Part I. Discussion of the Split-Halves and Test-Retest Techniques 


The estimation and interpretation of the reliability of physical education 
skill tests has long presented problems to both the developers and users of 
these tests. In estimating reliability, the test maker has been forced to choose 
from among many formulas, without a clear understanding of the assump- 
tions which underlie these formulas. Each is purported to give an estimate of 
the reliability of the test, yet the values which they yield are often far from 
equivalent. Confronted with such a dilemma, the test maker often computes 
and reports several reliability coefficients, one by each of the more common 
methods of estimation. More often than not, such a solution serves only to in- 
crease the confusion of the test user. Rarely does he possess sufficient sta- 
tistical sophistication to grasp the full implications of the differences between 
the various estimates. Without some helpful discussion by the test maker, 
the test user is left with the question, “Just how reliable is this test?” 


Reliability Concept 


Confusion with respect to the reliability of skill tests has arisen in large 
measure because of a faulty understanding of the basic definition of the 
reliability concept. Too much emphasis has been placed upon the methods 
of estimation and too little upon the entity being estimated. It is the purpose 
of this paper to examine the concept of reliability and to point out some of the 
more subtle implications underlying the traditional methods by which it is 
estimated. 

Any discussion of reliability must necessarily proceed from the definition 
of three terms: obtained score, true score, and error score. The first of these,
obtained score, is the numerical value which actually arises when an examinee
takes a test. It is the value ordinarily conceived to be the “score” on a test;
it is the only one of the three scores whose exact value is actually known. The 
terms “true score” and “error score” are less easily defined. One may think of 
a true score for subject A as the mean of a hypothetical, infinite series of 
measurements on that subject, each of the measurements being independent of 
the others and all being taken under the same conditions. While two obtained 
scores for A may vary, his true score for a given set of conditions is constant. 
The fact that obtained scores may vary while the true score remains constant 
leads to the definition of “error score.” An error score is conceived as the 
difference (positive or negative) between an obtained score and the corre- 
sponding true score. The error score is a function of many factors—both 
internal and external—which cause the examinee to get a better score on one 
testing than on another. In theory (and only in theory) it is possible to parti- 
tion A’s observed score into two parts: a constant portion—the true com- 
ponent—and a variable or chance portion—the error component. 

From the definitions of the three types of scores, the following relationship 
may be derived: 

Obtained score = True score + Error score. 

In test theory, the assumption is made that the error component of the obtained 
score is independent of the true component. This may be interpreted to mean 
that large error scores do not tend to be associated with high true scores nor 
small error scores with low true scores. In effect, error scores of any magnitude
can and do occur in conjunction with true scores of any magnitude.¹
From this assumption of independence, an equality of fundamental importance 
follows: 

Variance of obtained scores = Variance of true scores + Variance of error scores.

The variances referred to in this equality are those which correspond to the
distribution of scores for an infinite population of examinees. The proof of
this relationship will not be presented here, but it may be easily derived
from the expansion of the binomial (t + e)².
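
A sketch of that derivation follows. It assumes, as a notational convenience not stated in the text, that $t$ and $e$ are written as deviations from their respective population means, so that the obtained score in deviation form is $x = t + e$:

```latex
\begin{align*}
\sigma_x^2 &= E\big[(t + e)^2\big] \\
           &= E[t^2] + 2\,E[te] + E[e^2] \\
           &= \sigma_t^2 + 2\,\operatorname{Cov}(t, e) + \sigma_e^2 \\
           &= \sigma_t^2 + \sigma_e^2 .
\end{align*}
```

The cross-product term drops out because the error component is assumed to be independent of the true component, so that its covariance with $t$ is zero.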

The concept of reliability is defined in terms of the variance of error scores 
and the variance of true scores. It is, in fact, defined as the ratio of the 
variance of true scores to the variance of the obtained scores. For simplicity, 
this definition may be written in the mathematical form of a ratio, as follows: 


$$\text{Reliability} = \frac{\text{Variance of true scores}}{\text{Variance of obtained scores}}$$


The above relationship may never be computed directly, since the true
scores for a sample of examinees are never known. Nevertheless, the importance
of this definition cannot be minimized, for all reliability formulas yield
estimates of the value of such a ratio. It should not be inferred, however,
that all reliability formulas represent estimates of the same theoretical ratio.
It is over this point that so much of the confusion in the estimation of test
reliability has developed. For, before any estimate may be made of the variance
ratio, the investigator must identify which factors, of the many which
influence the obtained score, are to be counted as contributing to the error
variance and which to the true variance.

¹ In many cases of actual measurement, there is some justification for doubting the
tenability of this assumption. In such cases the theory is invalidated to the extent that
the assumption is violated. However, even in such cases, useful conclusions may be drawn.
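
The variance ratio can be computed directly only in a simulation, where the true scores are known by construction. The following minimal sketch assumes normally distributed true and error scores with hypothetical standard deviations of 6 and 3, so that the theoretical ratio is 36/45 = 0.80; none of these values comes from the study. Under these assumptions the correlation between two independent testings of the same examinees estimates the same ratio.

```python
import random
import statistics


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_reliability(n_examinees=20000, sd_true=6.0, sd_error=3.0, seed=1):
    """Return (true/obtained variance ratio, correlation of two testings)."""
    rng = random.Random(seed)
    true = [rng.gauss(50.0, sd_true) for _ in range(n_examinees)]
    # Each obtained score is the true score plus an independent error score.
    first = [t + rng.gauss(0.0, sd_error) for t in true]
    second = [t + rng.gauss(0.0, sd_error) for t in true]
    ratio = statistics.pvariance(true) / statistics.pvariance(first)
    return ratio, pearson(first, second)


ratio, r = simulate_reliability()
print(f"variance ratio = {ratio:.3f}, correlation = {r:.3f}")
```

Both printed values should lie close to 0.80 for a sample of this size, although the exact figures depend on the random seed.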

An example may be helpful at this point. A skill test in bowling might con- 
sist of one or several complete games bowled on one or more days (9). It 
is well known that bowlers exhibit definite day-to-day variation in their per- 
formance. Some of the individual differences between bowlers which can be 
observed in any testing session are due to this “good-day, bad-day” factor. 
Given two bowlers of equal skill, one may be “hot” and achieve relatively 
high scores and the other may be “cold” and achieve relatively low scores. 
The over-all effect of this factor is to add to the variability of the scores 
for the group as a whole and to magnify the individual differences which 
already exist. Should variations within an individual from one day to the 
next due to this “hot-cold” factor be regarded as part of the error component 
or part of the true component of the obtained score? To this question there 
is no unequivocal answer. Variation that is attributable to this source might 
be counted as either true variance or error variance, depending upon our 
purpose. The fact worth noting is that the investigator who counts such a 
source as contributing to the true variance will (or should) obtain a different 
estimate of reliability from the investigator who counts it as a source of error 
variance. 

This example illustrates a fundamental aspect of the concept of reliability. 
There is no single index which may be rightfully called the reliability of the 
test. There are, in fact, as many reliabilities as there are ways in which one 
may rationally partition the variance of the obtained scores into true and error 
components. Herein also lies the most fundamental difference among the vari- 
ous traditional methods of estimating reliability. While Method A consigns 
factors X and Y to the realm of true variance and factor Z to error, Method B 
may relegate only X to the true component and both Y and Z to error. Should 
factor Y contribute a significant proportion of the observed variance, the 
two methods would naturally result in widely divergent estimates of reliability. 
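
The contrast between Method A and Method B can be put in numerical form. The sketch below assumes three independent factors X, Y, and Z with hypothetical variance contributions of 30, 20, and 10; the figures are illustrative only and the function name is introduced here for convenience.

```python
def reliability(true_components, error_components):
    """Ratio of true variance to obtained (true plus error) variance."""
    true_var = sum(true_components)
    return true_var / (true_var + sum(error_components))


# Hypothetical variance contributions of three independent factors.
var_x, var_y, var_z = 30.0, 20.0, 10.0

# Method A: X and Y counted as true variance, Z as error.
method_a = reliability([var_x, var_y], [var_z])
# Method B: only X counted as true variance, Y and Z as error.
method_b = reliability([var_x], [var_y, var_z])

print(f"Method A: {method_a:.2f}  Method B: {method_b:.2f}")
```

The obtained variance is the same in both cases; only the assignment of factor Y changes, and with it the coefficient.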

Unfortunately, the typical textbook in educational or psychological measure- 
ment does not approach the concept of reliability from its variance definition. 
More often, reliability is defined in terms of a particular method of estimation. 
The limitations of such a definition should be obvious. It precludes any dis- 
cussion of the possible alternative assignment of an identifiable factor to either 
the error or true component of the obtained score. Moreover, such an ap- 
proach obscures the fact that each of the traditional methods of estimation 
automatically consigns certain classes of factors to the true component. Since
this automatic assignment is rarely stated in a precise fashion, neither the
test user nor the test maker may be aware that the method employed is entirely 
unsuited to his purpose. 


Methods of Estimating Reliability 

In the light of the foregoing paragraphs, two traditional methods commonly 
employed in the estimation of the reliability of skill tests will be considered. 
These methods are (a) test-retest correlation and (b) split-halves correlation 
in conjunction with the Spearman-Brown prophecy formula. Specifically, 
their application will be examined with reference to the example cited above— 
a skill test in bowling. Such a test might consist of the bowling of 20 balls 
on a single day. After each roll the pins are reset, so that each trial is per- 
formed with a full complement of ten pins. The final score assigned to each 
subject is the total number of pins scored in 20 trials. 

In order to estimate the reliability of the 20 trials by the test-retest method, 
an investigator would arrange for each subject to bowl a total of 40 trials. 
The total on the first block of 20 trials would constitute the first measure, the 
total on the second block of 20 would constitute the second. Because of the 
administrative necessity of testing subjects on a group basis in the average 
school class period of less than one hour, the two blocks of 20 trials would, 
in all probability, occur on different days. The test-retest reliability would 
ultimately be obtained by correlating performance on day number one with 
that on day number two. 

It is important to note how the adoption of this method has automatically 
resulted in the assignment of certain factors to the realm of “error.” One of the 
most important of these factors is the degree of “hotness” or “coldness” 
evidenced by each subject on the given days. Above or below average perform- 
ance on a given day has subtle causes which are almost impossible to isolate. 
Nevertheless, the phenomenon is manifested as much in the performance 
of the beginner as in that of the skilled professional. Since this effect is con- 
ceived to be constant for any subject on any single day but variable for any 
subject from one day to another, the test-retest correlation is lowered to the 
extent that this factor operates. Therefore, it may be concluded that this effect 
—this particular combination of day and subject which tends to raise or lower 
the subject’s score—has been relegated to error. 

There are a host of other effects that have been similarly classified. They 
include all such factors as day-to-day fatigue condition, mental set, bodily 
health, and level of daily motivation. The slight cut on one examinee’s finger 
which causes a change in grip and an ultimate lowering in his score on day 
number one is adding to the error component of this subject’s score. The 
blister on the heel, the worry about a failed exam, the daydream about 
tomorrow’s date, the general feeling of “not wanting to bowl today,” the 
poorly fitted ball—all these will contribute to the error variance when test- 
retest reliability is evaluated. The list of factors, major and minor, which 
vary from one day to the next, but which are constant for any given day,
would indeed be extensive. And each of these factors has been automatically
assigned to the error component of each subject’s score. 

The split-halves technique might also be utilized in estimating the reliability 
of this test. For this purpose only a single day’s performance on 20 trials need 
be measured. In the typical application of this method, a total would be 
determined for the odd-numbered and even-numbered trials. The correlation 
would then be obtained between these measures and this value would be 
corrected by the Spearman-Brown prophecy formula. 
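
The procedure just described may be sketched as follows. The pin counts are hypothetical and are limited to six trials for brevity; the correction applied is the usual Spearman-Brown formula for a test of doubled length, 2r / (1 + r).

```python
def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half, factor=2.0):
    """Project the reliability of a test lengthened by `factor`."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def split_half_reliability(trials_by_subject):
    """Correlate odd-trial and even-trial totals, then step up to full length."""
    odd = [sum(trials[0::2]) for trials in trials_by_subject]   # trials 1, 3, 5, ...
    even = [sum(trials[1::2]) for trials in trials_by_subject]  # trials 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))


# Hypothetical pin counts for five bowlers on six trials each.
pins = [
    [7, 8, 6, 9, 7, 8],
    [4, 5, 3, 6, 5, 4],
    [9, 9, 8, 10, 9, 9],
    [5, 7, 6, 6, 4, 7],
    [8, 6, 7, 8, 9, 7],
]
estimate = split_half_reliability(pins)
print(f"split-halves estimate = {estimate:.3f}")
```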

The effect of estimating reliability from 20 trials on a single day is to re- 
classify all of the factors enumerated above from the realm of “error” to the 
realm of “true.” A “hot” day by one subject will naturally be evidenced on 
odd- and even-numbered trials to the same degree; the “off” day scores of
another subject will be similarly affected. All factors of health, fatigue, atti- 
tude, and motivation are systematically equated between the two scores for 
each subject. Even the warm-up effects and the progressive fatigue engendered 
by the test itself are evenly distributed between the odd score and the even 
score for each subject. Thus, the split-halves method has defined error factors 
in a much more restricted manner than has the test-retest method. 

To the extent that these factors contribute a significant proportion of the 
total variation among the obtained scores, the split-halves coefficient may be 
expected to exceed the test-retest coefficient. With respect to some skills tests, 
it might be reasonable to suppose that such factors contribute only a negligible 
part of the total variability, at least in comparison with that which results 
from true differences in ability among the subjects in the population. In such 
cases, the two methods should give comparable results. On the other hand, in 
a population of highly skilled bowlers, such factors might well account for 
more than half the variability in the obtained scores. In such a case, the two 
methods could be expected to yield divergent results. 
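
The divergence between the two coefficients can be illustrated by simulation. The sketch below assumes that each trial score is the sum of a constant ability, a day effect that is constant within a day, and an independent trial error, all normally distributed; the standard deviations and sample size are hypothetical. Test-retest reliability is the correlation between the two daily totals of 20 trials, and the split-halves value is computed from the first day alone.

```python
import random


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)


def simulate(sd_ability, sd_day, sd_trial, n_subjects=3000, n_trials=20, seed=7):
    """Return (test-retest r, split-halves r) for simulated bowlers."""
    rng = random.Random(seed)
    day1, day2, odd, even = [], [], [], []
    for _ in range(n_subjects):
        ability = rng.gauss(0.0, sd_ability)
        days = []
        for _day in range(2):
            day_effect = rng.gauss(0.0, sd_day)  # the "hot" or "cold" day
            days.append([ability + day_effect + rng.gauss(0.0, sd_trial)
                         for _ in range(n_trials)])
        day1.append(sum(days[0]))
        day2.append(sum(days[1]))
        odd.append(sum(days[0][0::2]))   # odd-numbered trials, day one only
        even.append(sum(days[0][1::2]))  # even-numbered trials, day one only
    return pearson(day1, day2), spearman_brown(pearson(odd, even))


# Day effects as large as true differences: the coefficients diverge.
retest_large, halves_large = simulate(sd_ability=1.0, sd_day=1.0, sd_trial=3.0)
# No day effects: the two methods give comparable results.
retest_none, halves_none = simulate(sd_ability=1.0, sd_day=0.0, sd_trial=3.0)
print(f"with day effects:    test-retest {retest_large:.2f}, split-halves {halves_large:.2f}")
print(f"without day effects: test-retest {retest_none:.2f}, split-halves {halves_none:.2f}")
```

Under these assumed values the split-halves coefficient is about twice the test-retest coefficient when day effects are present, and the two agree closely when they are absent.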

The question might well be raised, “To which category—true variance or 
error variance—should such factors contribute?” To this question there is no 
single answer. If the coefficient is to reflect the accuracy of test rankings at 
the conclusion of a semester’s training, such factors should be considered as 
increasing the error variance. For subsequent out-of-school performance will 
continue to reflect the vagaries of day-to-day differences. To count such 
factors in the true component of each subject’s score is to weight rather 
heavily the chance conditions which exist on the particular day on which the 
measurements are taken. 

On the other hand, the split-halves coefficient does indicate the reliability 
of the test at the particular moment in time in which it is administered. Such 
a value might be interpreted as the upper limit of the reliability that may be 
achieved over different days. This coefficient might also be used to indicate 
the need for increasing the number of trials on a single day as a necessary 
prerequisite to raising the between-days reliability. When measurements are 
taken during a learning period, it is possible that a second block of 20 trials 
may result in differential practice increments for individual examinees. In
such a case, the split-halves coefficient could represent the more appropriate
estimate. 

The discussion thus far has been limited to the relation of split-halves and 
test-retest techniques to the definition of true and error scores. Such tech- 
niques, while fairly simple to apply and traditionally acceptable, do not repre- 
sent the most efficient techniques of estimating the ratio of true variance to 
obtained variance. Often the experimental data cannot be analyzed by usual 
product-moment correlation techniques to obtain a reliability estimate for a 
specified test series in which the experimenter is interested. Even when 
such techniques may be used, they almost always result in needless loss of 
valuable information. In the second part of this paper, a demonstration is 
made of the manner in which the techniques of analysis of variance may be 
used to obtain independent estimates of the variance contributed by various 
factors affecting the obtained scores. It is then indicated how these estimates 
may be combined to obtain a series of reliability coefficients, each coefficient 
being consistent with a different classification of factors into the categories of 
true and error. 


Part II. Analysis of Variance Approach to Reliability Estimation 

The value of analysis of variance techniques is widely recognized in the analysis of
experimental treatment effects. Their use in estimating components of variance,
with particular application to the reliability of physical education skills
tests, is less widely recognized. These techniques, which are based on the
pioneer work of Fisher (2), have several advantages over traditional ap- 
proaches to reliability estimation. Where several distinguishable sources of 
measurement error exist, the components-of-variance approach permits an 
evaluation of the relative importance of each. Usually, in such cases, several 
useful definitions of error, varying in comprehensiveness, can be phrased. 
The analysis of the error variance into components provides the basis for 
computation of a reliability coefficient consistent with each definition and 
makes this definition an explicit rather than an implicit one. The analysis 
also permits the experimenter to estimate the effect of a greater variety of 
modifications of the original test than can be estimated from the traditional 
Spearman-Brown prophecy formula. This in turn makes possible a more constructive
approach to the design of an efficient skills test.


Value of This Approach 


The value of this more flexible approach may be illustrated by its application
to a hypothetical test of diving ability. In this hypothetical test each
of a number of divers executes a certain dive twice for each of three judges. 
The six performances of the dive by each diver are independently observed 
and rated by the judges. The three judges may be regarded as a random 
sample from a population of judges, the divers a random sample from a 
population of divers, and the repeated executions a random sample from a 
population of such executions by the given diver. For this test it would be
possible to compute any number of product-moment reliability coefficients.
For example, one might correlate the pairs of ratings by each judge, combin- 
ing in some way the data for the three judges. Or one might obtain the mean 
rating by each judge on each diver and correlate these mean ratings for all 
possible pairs of judges. Or one might obtain the mean rating on the first 
trial by the three judges and correlate this with the mean rating on the 
second trial. While each of these coefficients, and many others, could be 
described as a “reliability coefficient,” they are clearly not equivalent. They 
differ in the test series to which they refer and the factors which are classified 
as errors of measurement, though neither of these aspects is made explicit by 
the method of estimation. The ambiguity of these coefficients arises primarily 
from the fact that such coefficients are generally “coefficients of convenience.” 
They have not been derived from any theoretical consideration of the meaning 
of the concept of reliability, as it applies to the test, but rather from the arbi- 
trary necessity of reducing the data to two sets of measures which may be 
substituted in the product-moment correlation formula. A coefficient of such 
ambiguity can hardly be used as a basis for estimating the number of dives 
and the number of judges which will constitute a test of specified reliability. 

The analysis of variance approach forces the investigator to examine his 
definition of measurement error. In this test it is possible to identify two 
independent sources of unreliability. The first of these is represented by the 
inconsistency in the examinee himself. In any act calling for highly developed 
co-ordination and timing, human beings always evidence some variation in the 
quality of performance from trial to trial. Skilled divers may, by practice, 
reduce the extent of this variation, but it can never be eliminated entirely. 
Hence, the factors which manifest themselves in variation of this kind will 
always result in some degree of unreliability in the test. 

A second source of error is represented by the judges who must make a 
subjective rating of the quality of the performance. It seems reasonable to 
assume that no two judges exercise precisely the same subjective standards 
in the evaluation of a dive. While one judge may subconsciously assign a 
slightly heavier weight to the sharpness of the diver’s entry into the water than 
to his form in take-off from the board, another judge might unconsciously 
weight these aspects in the reverse order. Thus, even two judges who do not 
differ in the severity of their ratings over all divers can be expected to differ 
in their ratings of individual divers. A diver who excels in those specific 
aspects which are rated more heavily by Judge A than by Judge B will 
receive higher ratings from Judge A than Judge B. Similarly, the diver 
who does well the things weighted most heavily by Judge B will stand to 
benefit from the idiosyncrasies of his scoring system. 

This phenomenon is called the judge-by-diver interaction, and it would be 
numerically quantified by the interaction term in a two-dimensional analysis 
of variance. Unlike the factors associated with examinee inconsistency, this 
effect might or might not be regarded as error, depending upon the definition 
of the test series. If the judges were regarded as a random sample from a
population of judges and the measurement situation always included an
independent sample from this population, we might consider this factor as 
contributing to error. Such would be the case, perhaps, in a swimming meet 
to which judges were assigned at random from a hypothetical population of 
judges. In such a situation, it would be a fortuitous circumstance whether the 
diver who excelled in one particular aspect of the dive would stand to benefit 
to some slight degree. Part of the inconsistency in any pair of ratings by 
different judges would be attributable to this factor. On the other hand, if we 
presume the judge or judges remain constant from one test to another, as 
would the instructor in a diving class for example, this factor would not be 
classified as an error of measurement. In this situation, the peculiar idiosyn- 
crasies of the judge’s scoring system would account for a consistent, systematic 
advantage for some divers over others. This factor would therefore contribute 
to the “true” variation among divers. 

Given the appropriate experimental setup, the analysis of variance approach 
to reliability estimation would permit flexible handling of such an effect. The 
experimenter could compute reliability coefficients under both definitions, 
and could assess the relative importance of this source of error. In this 
hypothetical example, the analysis of variance technique would also enable 
the investigator to estimate the changes in reliability which would result 
from various modifications of the test. It would be possible, for example, to 
estimate the effect of increasing the number of dives observed by each judge 
while retaining the same number of judges. Also, it would be possible 
to estimate the effect of increasing the number of judges while holding con- 
stant the number of dives observed by each. More generally, it would be 
possible to approximate the reliability of the test defined by any combination 
of one or more dives observed and rated by one or more judges. Thus the 
experimenter would be in a position to design the most efficient test consistent 
with local restrictions on the number of available judges and limitations 
imposed by examinee fatigue. 
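
The kind of calculation described above may be sketched as follows. The three variance components (divers, judge-by-diver interaction, and execution-to-execution variation) are hypothetical figures, not estimates from any study, and the function is an illustration introduced here. It is further assumed that every judge rates every diver, so that differences in over-all severity among judges do not affect the ranking of divers and are left out of the error term.

```python
def diving_reliability(var_diver, var_interaction, var_execution,
                       n_judges, n_dives, judges_random=True):
    """Reliability of a diver's mean rating over n_judges judges,
    each of whom rates n_dives separate executions of the dive."""
    interaction_share = var_interaction / n_judges
    execution_share = var_execution / (n_judges * n_dives)
    if judges_random:
        # A fresh sample of judges on each occasion: interaction is error.
        true_var = var_diver
        error_var = interaction_share + execution_share
    else:
        # The same judges on every occasion: interaction is systematic.
        true_var = var_diver + interaction_share
        error_var = execution_share
    return true_var / (true_var + error_var)


# Hypothetical components of variance.
components = dict(var_diver=4.0, var_interaction=1.0, var_execution=2.0)

as_designed_random = diving_reliability(**components, n_judges=3, n_dives=2)
as_designed_fixed = diving_reliability(**components, n_judges=3, n_dives=2,
                                       judges_random=False)
more_judges = diving_reliability(**components, n_judges=5, n_dives=2)
more_dives = diving_reliability(**components, n_judges=3, n_dives=4)

for label, value in [("3 judges, 2 dives, judges random", as_designed_random),
                     ("3 judges, 2 dives, judges fixed", as_designed_fixed),
                     ("5 judges, 2 dives, judges random", more_judges),
                     ("3 judges, 4 dives, judges random", more_dives)]:
    print(f"{label}: {value:.3f}")
```

Comparing the last two lines of output shows how such a table might guide the choice between adding judges and adding dives when only one of the two is practicable.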

It would be impossible in this paper to give a complete exposition of the 
techniques of analysis of variance and their mathematical bases. A number 
of textbooks in this field (3, 4, 5, 10) are probably familiar to serious 
students of research methodology. The text by Lindquist (5) will probably 
be most useful to investigators who wish to utilize the techniques in reliability 
studies in view of the extensive discussion given this topic (Chapter 16). It 
should be possible, however, to suggest to the sophisticated reader via a 
worked example the potentialities of these techniques for reliability analysis. 


Analysis of Reliability of a Test 

For illustrative purposes an analysis will be presented of the badminton 
wall volley test described by Lockhart and McPherson (6) and Scott and 
French (9, pp. 74-76). In this test the examinee is required to volley the bird 
off a vertical surface or wall as fast as possible for a period of 30 seconds. 
The measure of performance is taken as the number of successful returns 





Estimation of Reliability of Skill Tests 287 


executed before the end of the 30-second period. A number of trials, usually 
three or four, constitute the test series. It is typically used in physical edu- 
cation classes for homogeneous grouping of students and the assessment of 
individual achievement. 

An introspective analysis of this test suggests that measurement error may 
be analyzed into two presumably independent components: (a) that associ- 
ated with trial-to-trial variation within a given testing period and (b) that 
associated with day-to-day variation from one testing period to another. 
Factors which contribute to the first component include the many variable 
elements involved in the co-ordination, timing, reaction time, perception, 
temporary muscle fatigue, etc., which are called into play on any particular 
volley. They include the almost infinite number of chance elements which 
make for greater or lesser success on any given shot. The second component 
represents what might be called the “good-day, bad-day” phenomenon, which 
characterizes the skilled performance of even expert players. Error factors 
associated with this second component include all those which vary from 
one day to the next, but which are constant for any given day. They include 
conditions of body fatigue, physical health, mental attitude, temporary moti- 
vational state, distractability, and the many others which can be only vaguely 
described. 

In designing a reliability study of this test, it was first necessary to assume 
a mathematical model consistent with this analysis of the measurement situ- 
ation. It was then necessary to plan the experiment so that the requirements 
of the model would be met and the data would provide estimates of the as- 
sumed variance components. These steps are analogous to those which precede 
the actual gathering of data in any well-planned experiment. In this case a 
relatively simple model was assumed. In mathematical form, the model for 
the observed score ($x$) of each examinee was represented by the sum of three
independent components: a “true” component ($t$) representing the hypothetical
level of ability of the examinee which might be ascertained only
through an infinite series of trials on an infinite number of days, an error
component ($e_1$) associated with trial-to-trial variation, and an error
component ($e_2$) associated with day-to-day variation. Symbolically, this becomes


$$x = t + e_1 + e_2. \qquad \text{(Formula 1)}$$


It might well be argued that the model adopted for this analysis represents 
an oversimplification of the measurement situation or that it does not take 
into consideration certain important effects. The model assumes that no indi- 
vidual growth in skill occurs during the entire test series. Also, the model 
makes no provision for “warm-up” or practice effects which might occur on 
the first few trials within any testing period. Such systematic effects, to the 
extent that they occur, are inappropriately counted as errors of measurement 
under this model. However, by limiting the test to four trials in each of two 
class periods closely spaced in time, the experimenters tried to insure that 
little or no learning would occur during or between testing sessions. Also, by
allowing each examinee several preparatory trials before the test proper, the
experimenters hoped to reduce the importance of the warm-up factor. Thus, 
the experimental procedures were so planned as to permit the legitimate use 
of this relatively simple model. 

The model defined by Formula 1 calls for a two-dimensional analysis of 
variance in which “Testing Periods” are represented by the first dimension 
and “Subjects” by the second. The repeated measurements on each subject 
in each period constitute the “within-cells” replication of measures. Lindquist 
(5, p. 381) has called this model “Groups (of Observations) Within Subjects.” 

The first error component, that associated with variation from trial to 
trial within a testing period, is evidenced by variability within cells in this 
model; this component plus the second (the “good-day, bad-day” effect) is
included in the cell-to-cell variation within subjects. The following notation 
will be adopted for the various variance components assumed in the model: 

$\sigma_{e_1}^2$ = trial-to-trial error variance

$\sigma_{e_2}^2$ = day-to-day error variance

$\sigma_t^2$ = variance of true scores for the population of examinees.


The summary table for the model, with the expected values of the various
mean squares, is presented in Table 1. The sample estimators for these components
are noted in Table 2. In these tables $N$ represents the number of
experimental subjects, $n$ the number of trials in any period, and $a$ the number
of testing periods.


TABLE 1
Symbolic Model for the Analysis of the Reliability of the Badminton Wall Volley Test

Source                             df            ms     Expected Value of the Mean Square

Between subjects                   N - 1                $\sigma_{e_1}^2 + n\sigma_{e_2}^2 + an\sigma_t^2$
Between periods within subjects    N(a - 1)             $\sigma_{e_1}^2 + n\sigma_{e_2}^2$
Within cells                       aN(n - 1)            $\sigma_{e_1}^2$
Total                              aNn - 1


TABLE 2
Sample Estimators of Variance Components

This model was applied to data gathered on an intermediate group of 24
women students at the State College of Washington in November 1957. 
Students assigned to this class had had one semester of badminton instruction 
at the college level or demonstrated skill and knowledge comparable to that 
of the average student at the end of a one semester course. At the time of 
testing, the class was in the seventh and eighth weeks of instruction. Consistent 
with the requirements of the model, the test was administered in two sessions 
which, to insure independence of the observations, were scheduled a week 
apart. At the beginning of each testing period, all students were permitted 
one minute of “timed practice” to warm up, using a new bird under test 
conditions. The test rules and administrative mechanics were then briefly 
reviewed before the actual testing was begun. The test was administered to 
the entire class in the first class meeting of the week. The summary table for 
the analysis of variance is reported in Table 3. 


From the mean squares reported in Table 3, and the formulas for the
sample estimators of the variance components noted in Table 2, estimates
were obtained for the three components assumed in the model. These are
reported in Table 4. It may be noted that error variance associated with
day-to-day changes in the examinee ($\sigma_{e_2}^2$) is considerably smaller than that
arising from chance factors associated with a particular trial ($\sigma_{e_1}^2$).
However, the former does make a unique contribution to the error variance
which is not insignificant.
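
The estimation step just described can be checked numerically. The following is a minimal sketch in Python, assuming that the estimators are obtained by solving the expected mean squares of Table 1 for each component in turn; the function and argument names are illustrative only and do not come from the original report.

```python
def variance_components(ms_subjects, ms_periods, ms_within, n_trials, n_periods):
    """Solve the expected-mean-square equations of Table 1 for the components.

    Assumed relations (from Table 1):
      E(ms_within)   = trial_var
      E(ms_periods)  = trial_var + n * day_var
      E(ms_subjects) = trial_var + n * day_var + a * n * true_var
    """
    trial_var = ms_within
    day_var = (ms_periods - ms_within) / n_trials
    true_var = (ms_subjects - ms_periods) / (n_periods * n_trials)
    return trial_var, day_var, true_var


# Mean squares from Table 3; four trials in each of two periods.
trial_var, day_var, true_var = variance_components(
    351.30, 55.51, 25.85, n_trials=4, n_periods=2
)
print(trial_var, day_var, true_var)
```

Under these assumptions the three values agree, to rounding, with the figures 25.85, 7.42, and 36.97 that appear in Formula 3.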

TABLE 3
Summary Table of the Analysis of Variance

Source                             df     Sum of Squares    Mean Square
Between subjects                   23         8079.95          351.30
Between periods within subjects    24         1332.13           55.51
Within cells                      144         3722.25           25.85
Total                             191       13,134.33

TABLE 4
Sample Estimates of the Variance Components

With the values presented in Table 4, we may estimate the reliability of the
mean (or sum) of any number of trials (n’) on each of any number of
days (a’). These estimates may be derived from a general reliability formula
for this model based on the definition of reliability as the ratio of the variance
of true scores to the sum of the variance of true scores plus the variance of
error scores. This formula is as follows:





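$$\text{Reliability (of the mean of } n' \text{ trials on } a' \text{ days)} = \frac{\sigma_t^2}{\sigma_t^2 + \dfrac{\sigma_{e_2}^2}{a'} + \dfrac{\sigma_{e_1}^2}{a'n'}} \qquad \text{(Formula 2)}$$

In the case being considered, the formula for estimating reliability coefficients
becomes

$$r = \frac{36.97}{36.97 + \dfrac{7.42}{a'} + \dfrac{25.85}{a'n'}} \qquad \text{(Formula 3)}$$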


A number of specific reliability coefficients may be computed from Formula
3. For example, the reliability of the mean of eight trials, four on each of
two days (the set-up actually employed in the reliability study), would be
estimated by substituting a’ = 2, n’ = 4:

$$r = \frac{36.97}{36.97 + \dfrac{7.42}{2} + \dfrac{25.85}{8}} \approx .84$$

This coefficient would be comparable to the correlation between one mean
of eight trials, four on each of two days, and a second mean of eight additional
trials on two additional days.

The estimated reliability of the mean of four trials on one day is

$$r = \frac{36.97}{36.97 + \dfrac{7.42}{1} + \dfrac{25.85}{4}} \approx .73$$


This coefficient is comparable to the correlation between the mean of four 
trials on one day with the mean of four more trials on another day. 

The theoretical limit of the reliability obtainable in a single testing session 
would be found by letting a’ = 1 and n’ approach infinity. In this case, the 
limiting reliability would be 


$$r = \frac{36.97}{36.97 + \dfrac{7.42}{1}} = .833.$$

This coefficient estimates the hypothetical correlation which would obtain
between the mean of an infinite number of trials on one randomly selected
day and a second such mean for a second randomly selected day. Other
coefficients could be obtained by substituting appropriate values for a’ and n’.
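
The substitutions discussed above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the three variance component estimates quoted in Formula 3; the function name `reliability` and its argument names are illustrative and are not part of the original report.

```python
# Variance component estimates as quoted in Formula 3.
TRUE_VAR = 36.97   # variance of true scores
DAY_VAR = 7.42     # day-to-day error variance
TRIAL_VAR = 25.85  # trial-to-trial error variance


def reliability(n_days, n_trials):
    """Reliability of the mean of n_trials on each of n_days (Formula 3)."""
    error = DAY_VAR / n_days + TRIAL_VAR / (n_days * n_trials)
    return TRUE_VAR / (TRUE_VAR + error)


two_days_four_trials = reliability(n_days=2, n_trials=4)
one_day_four_trials = reliability(n_days=1, n_trials=4)
# Limit for a single session as the number of trials grows without bound:
# the trial-to-trial term vanishes and only the day-to-day term remains.
single_session_limit = TRUE_VAR / (TRUE_VAR + DAY_VAR)

print(two_days_four_trials, one_day_four_trials, single_session_limit)
```

Writing the formula as a function of the number of days and the number of trials makes it straightforward to tabulate other combinations of a’ and n’.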

It should be noted that this reliability analysis was performed on a rather 
small sample of subjects—considerably fewer than should be utilized in such 
a reliability study. In actual practice, experimenters should not be tempted 
to make a thorough analysis of test reliability with fewer than several hundred 
cases. It may be expected that the sampling error in analysis of variance 
estimates is comparable to that of product-moment coefficients based on the 
same numbers of cases. 


Relationships Between Methods 


The relationships between analysis of variance estimates and those
provided by the split-halves and test-retest techniques are not immediately
obvious. The difficulty arises from the fact that, under the latter techniques,
the definition of error factors is rarely explicitly and unambiguously stated. 
It is possible, although only with difficulty on many occasions, to make explicit 
the definition of error underlying reliability estimates derived by traditional 
techniques. When this is done, one may usually identify quite readily the 
analysis of variance estimates comparable to them. For example, one com- 
monly employed split-halves method involves the summing or averaging of 
Trials 1 and 2 on a given day and correlating this value with the sum or aver- 
age of Trials 3 and 4 on the same day. The obtained correlation is then 
corrected by the Spearman-Brown formula. It may be shown that the estimate 
secured by this method defines the first error component as error, as it should, 
but defines the second error component as true rather than error variance. 
Thus the “good-day” or “bad-day” effects which characterize individual 
examinees on the day of testing are implicitly classified as permanent, 
systematic characteristics, rather than temporary, variable characteristics 
which change from day to day. The user of this technique is thus defining 
errors of measurement in a rather restricted fashion, probably unknowingly. 
A reliability estimate can be derived from the analysis of variance approach 
which would be comparable to this split-halves coefficient. For such an
estimate, the term involving the second error variance component ($\sigma_{e_2}^2/a'$) would
be included with the true variance component in the numerator of reliability
formula (2). Letting a’ = 1 and n’ = 4, we obtain a “split-halves” reliability 
of .87 for 4 trials in one day. This value is consistent with those reported by 
Scott and French (9). 

Another common technique for estimating the reliability of this and 
similar tests—the test-retest method—involves the averaging of the four trials 
in one session and correlating this average with that for a second session. 
This method is preferred to the split-halves method since it implicitly defines 
both error components as error. Such a coefficient is directly comparable to 
that derived in the foregoing analysis from Formula 2 with a’ = 1 and n’ = 4.
The value of this coefficient (.73) is somewhat lower than the split-halves 
reliability reported above since the day-to-day error factors are also taken into 
consideration. 

The analysis of variance approach to reliability estimation yields approxi- 
mations which are equivalent to those derived from traditional approaches, 
and, in addition, gives rise to many useful estimates which cannot be con- 
veniently derived by traditional techniques. For example, suppose the experi- 
menter wished to estimate from experimental data such as those secured in 
this study the number of trials per period needed to bring the reliability to 
.90, assuming only two periods are available for testing. No manipulation of 
traditional estimates through the Spearman-Brown formula will provide such 
an estimate. Neither can the prophecy formula be used to approximate the 
comparative reliabilities of, say, six trials in a single period as against three 
trials in each of two periods. Quite often experimenters who are quick to use 
the Spearman-Brown formula are not clear as to the test series to which the 
estimates actually refer. Formula 3, on the other hand, yields unambiguous 
estimates of the reliability of any arbitrary number of trials in any given 
number of testing periods. In addition, it allows a flexible but explicit defi- 
nition of error factors and permits a convenient evaluation of the relative 
importance of the various sources. 
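
The two planning questions raised in this paragraph can be worked through with Formula 3. The following Python sketch assumes the variance component estimates quoted in that formula; the helper names `reliability` and `trials_needed` are hypothetical and serve only to illustrate the arithmetic.

```python
import math

TRUE_VAR, DAY_VAR, TRIAL_VAR = 36.97, 7.42, 25.85  # estimates from Formula 3


def reliability(n_days, n_trials):
    """Reliability of the mean of n_trials on each of n_days (Formula 3)."""
    error = DAY_VAR / n_days + TRIAL_VAR / (n_days * n_trials)
    return TRUE_VAR / (TRUE_VAR + error)


def trials_needed(target, n_days):
    """Smallest number of trials per day reaching target, or None if unreachable."""
    # Total error variance that the target reliability can tolerate.
    allowed_error = TRUE_VAR * (1 - target) / target
    # Portion of that allowance left after the day-to-day term is deducted.
    remaining = allowed_error - DAY_VAR / n_days
    if remaining <= 0:
        return None  # no number of trials per day would suffice
    n_trials = max(1, math.ceil(TRIAL_VAR / (n_days * remaining)))
    while reliability(n_days, n_trials) < target:  # guard against rounding
        n_trials += 1
    return n_trials


trials_for_90 = trials_needed(0.90, n_days=2)
six_in_one_period = reliability(n_days=1, n_trials=6)
three_in_each_of_two = reliability(n_days=2, n_trials=3)
print(trials_for_90, six_in_one_period, three_in_each_of_two)
```

With these estimates the sketch indicates that about 33 trials in each of two periods would be required for a reliability of .90, and that three trials in each of two periods (about .82) would be more reliable than six trials in a single period (about .76). With only one period available, .90 could not be reached at all, since the single-session limit is .833.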

The analysis of variance approach is particularly suited to reliability 
analyses in physical education because of the rather common occurrence of 
situations in which several components of error variance may be distinguished. 
Also, the measures used in skills testing usually meet the requirements of the 
mathematical models, in that the test series may almost always be replicated 
at least several times without significant practice or fatigue effects. In addi- 
tion to situations in which the “day” effect is of some importance, the tech- 
niques should find important application in the many areas in which examinee 
inconsistency and judge inconsistency constitute independent sources of un- 
reliability. Such would be the case, for example, in diving or gymnastics. 
They might also be applied in measurement situations involving equipment
which might vary in significant respects from one examinee to another. 


Abstract

Two groups of 23 junior high school boys were tested before and after an eight 
weeks’ progressive resistance training program. The experimental group participated 
in this program, but the control group took part only in regularly scheduled physical 
education classes. At the end of the eight weeks, it was found that the experimental 
group increased their ability to do pull-ups, push-ups, the Harvard Step Test, Dodge 
run, the Burpee test, and trunk extension and flexion. The control group improved in 
the Dodge run, the Burpee test, push-ups, and trunk extension. In no case did the 
improvement of the control group exceed the improvement of the experimental group. 
The experimental group also increased in anthropometric measurements. Medical
examinations indicated that no harmful effects were experienced by either group.


SINCE THE early years of World War II interest in training with weights 
has increased rapidly. Advocates of progressive weight training have claimed 
many physical benefits through its practice (6, 7, 11, 12). On the other hand, 
athletic coaches and physical educators have advanced opinions warning of 
deleterious effects which they attribute to the practice of progressive weight 
training (10). Both groups base their statements primarily upon opinion 
unsupported by evidence. 


Studies have been published which were designed to ascertain the effects 
of progressive weight training on strength, athletic power, co-ordination, 
speed, and certain skills abilities. The majority of these studies have dealt 
with late adolescent and adult males. Also, none of the published reports 
have examined the effects of progressive weight training upon total physical 
fitness (3, 4, 9, 15, 16). 

This experiment was designed to determine the effects of a two-month 
systematic program of progressive weight training on the physical fitness of 
boys in early adolescence. 


Procedure 


Forty-six adolescent males attending a New York City junior high school 
were subjects. They ranged in age from 12 to 17 years. Volunteers who 
obtained parental consent were examined by the school physician. Those 
who evidenced hernia or potential hernia, as well as those with cardiac
conditions, were excluded from the experiment. 


The 46 boys participated in the pre-training test which was designed to 
determine physical fitness status before the conditioning period was started. 
In order to measure the components of physical fitness, test items were se- 
lected that have been used in one or more accepted test batteries for the 
measurement of physical fitness; are susceptible to rapid, accurate, and 
reliable measurement with a minimum of apparatus; and are valid measures 
of the attributes to be tested. Following are the components, and those test 
items selected to measure them: 

1. Anthropometric proportions
(a) Height and weight.
(b) Muscular girth: measurement of the limbs, chest, and waist. Limb measurements
were taken over tensed muscles with the tape held firmly as advocated
by Cureton (2).

2. Muscular strength and endurance
(a) Push-ups (from floor).
(b) Chinning.
(c) Standing broad jump.
(d) The experimental group’s changes in strength are indicated by the increasing
amount of resistance handled in the various exercises.

3. Flexibility
(a) Trunk flexion forward (sitting on floor).
(b) Trunk extension backward, as described by Cureton (5).

4. Speed and agility
(a) Squat thrusts (4-count Burpee).
(b) Dodge run (McCloy’s pattern).

5. Circulatory-respiratory function
(a) Harvard Step Test as modified for boys of 12 to 18 years of age (13).


The subjects were tested, measured, and divided into two groups. The
groups were equated so that their means were not significantly different
(Wilcoxon Sum of Ranks Test) (14) for the following items: strength score
(total chins plus total push-ups), Harvard Step Test score, age, ponderal
index, and ethnic origin (the latter in order to minimize the effect of type of diet).
A coin was tossed to determine which of the two groups should become the 
experimental group. 

The subjects in the experimental group participated in a progressive 
weight-training program three times weekly for a period of eight weeks. 
Training periods provided approximately 45 minutes of actual exercise. The 
program consisted of seven resistance exercises: curl, military press, supine
press, rowing exercise, squat, pullover, and sit-up. During the first two training weeks, one
series of each exercise was performed; during the next three weeks, two series 
of each were practiced; and during the final three weeks, three series of 
each were performed. A beginning weight was used which permitted eight 
repetitions. When 12 repetitions could be performed, the resistance was in- 
creased to a magnitude permitting the subject to perform eight repetitions. 
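
The progression rule described above can be summarized in a short Python sketch. It assumes the eight-week schedule and the 8-to-12 repetition range stated in the text; the function names are illustrative and were not part of the original program.

```python
def series_for_week(week):
    """Number of series of each exercise performed in a given training week."""
    if not 1 <= week <= 8:
        raise ValueError("the program described ran for eight weeks")
    if week <= 2:
        return 1  # first two weeks: one series
    if week <= 5:
        return 2  # next three weeks: two series
    return 3      # final three weeks: three series


def should_increase_load(reps_completed, upper_reps=12):
    """True once the subject can perform the upper repetition count."""
    return reps_completed >= upper_reps


schedule = [series_for_week(week) for week in range(1, 9)]
print(schedule)
```

When `should_increase_load` returns true, the text specifies that the new resistance is one that again permits eight repetitions; the amount of the increase is not stated and is therefore left out of the sketch.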

The control group received no conditioning other than participation in 
the regular physical education program. At the conclusion of the training
period, both groups were re-tested and re-examined. 


Statistical Treatment 

Pre-training and post-training scores and measurements were compared 
for significance of mean differences. In cases where the experimental and 
control groups both made significant gains, the gains were compared. 

The Wilcoxon Signed Ranks Test for Paired Observations was used to 
compare the pre-training and post-training results (14). For comparing 
differences between the experimental and control groups, the Wilcoxon Sum 
of Ranks test was used (14). Although these non-parametric methods are 
not as efficient as the more conventional Student’s t test, or the analysis of
variance, it was decided to employ them for the following reasons: The 
subjects were volunteers and not randomly chosen from the entire population 
of teen-age boys; there was a question as to whether the variance would be 
homogeneous as the changes in size and fitness are similar to growth and 
learning changes which often show heterogeneous variance; and finally the 
normality of the sample was questionable. 
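
For readers unfamiliar with the paired-observations procedure named above, the following Python sketch shows how the signed-rank sums are formed. The scores are invented solely for illustration and are not data from this study; the function name is likewise hypothetical, and the sketch stops at the rank sums without consulting a significance table.

```python
def signed_rank_sums(before, after):
    """Return (W+, W-): rank sums of positive and negative paired differences."""
    # Zero differences are discarded before ranking.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Find the run of tied absolute differences starting at position i.
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 through j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical pre-training and post-training scores (illustration only).
pre_scores = [8, 5, 10, 7, 9, 6]
post_scores = [12, 5, 9, 11, 15, 8]
w_plus, w_minus = signed_rank_sums(pre_scores, post_scores)
print(w_plus, w_minus)
```

The smaller of the two sums is the quantity ordinarily compared with the tabled critical value for the number of non-zero differences.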


Results 

Anthropometric Measurements. Changes in anthropometric measurements are 
listed in Table 1. In general, the experimental group made small but sta- 
tistically significant gains, while the control group gained in height and 
weight only. Since both groups were made up of subjects in their early 
teens, growth in height and weight was to be expected. 

The subjects in the experimental group grew an average of 0.19 inches 
in height, and those in the control group grew an average of 0.57 inches. 
Both gains were statistically significant at the 1% probability level, but 
were not significantly different from each other. The average gain of the 
control group was affected considerably by one subject who grew 2.5 inches 
during the two-month period. 

The mean weight change in the experimental group was minus 2.3 pounds, 
and in the control group was plus 2.04 pounds. These changes were not 
statistically significant, owing to the great variability between subjects within 
each group. In the control group, 16 of the 22 subjects gained weight, but 
of the six who lost, one lost ten pounds, a loss that neutralized several of the 
one- or two-pound gains. In the experimental group, 15 subjects gained 
weight. Of those who lost, the subject who lost the greatest amount (eight 
pounds) was a 14-year-old youth who had an initial weight of 184 pounds 
and a height of 5 ft. 7 in. The subjects in the experimental group showed 
increases in measurements of girth of neck, chest, biceps, forearm, thigh, 
and calf. The gains ranged from .3 in. in calf girth to 2.13 in. increase in 
chest girth. All of these gains or increases were significant at the 1 per cent 
level of probability. The waist girth of the experimental group decreased 
an average of one-half inch. The control group showed no significant changes 
in any of these measurements. 

Performance tests. The results of the eight performance tests selected to
measure the fitness gains of the subjects are listed in Table 2. The strength and
endurance of the arm and shoulder girdle were improved markedly in the 
experimental group as shown by the increase in the average number of pull- 
ups and push-ups. It is interesting to compare the results of these two test 
items with the results of the gains in weight lifted, in curls, and presses which 
are shown in Table 3 and Figure I. While the average number of pull-ups 
increased from five to seven (40%), the average number of push-ups
increased from 8.4 to 16.8 (100%). In contrast to this, curling ability
increased from 35 to 73.4 pounds (110%), and the ability to press increased
from an average of 40.2 to 64.3 pounds (60%). Since curls and pull-ups
depend primarily upon flexor strength, and push-ups and the press on
extensor strength, these conflicting results are difficult to explain. The control
group made no significant gain in pull-ups, and increased the average number
of push-ups by 2.4, a gain which was significant at the 1 per cent level but
was significantly smaller than the gain of 8.4 found in the experimental
group.
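
The percentage gains quoted in this paragraph follow from simple arithmetic, which the Python sketch below reproduces. The starting and final values are those given in the text; the function name and the dictionary layout are illustrative only.

```python
def percent_gain(before, after):
    """Gain expressed as a percentage of the starting value."""
    return (after - before) / before * 100


gains = {
    "pull-ups": percent_gain(5, 7),
    "push-ups": percent_gain(8.4, 16.8),
    "curl (pounds)": percent_gain(35, 73.4),
    "press (pounds)": percent_gain(40.2, 64.3),
}
for item, gain in gains.items():
    print(item, round(gain))
```

Rounded to the nearest whole per cent, the results match the 40, 100, 110, and 60 per cent figures reported above.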


TABLE 3
Mean Weight (Pounds) Used by Experimental Group During Eight Weeks of Training

                         Weeks
Exercise             2       4       6       8
Curl               50.2    58.4    65.4    73.4
Military press     52.9    55.4    58.6    64.3
Supine press       57.9    63.1    73.2    81.8
Rowing exercise    68.6    78.2    89.8    97.7
Squat              70.9    95.9   115.5   137.5

The only other tests in which the experimental group clearly gained more 
than the control group were the trunk flexion, and the Harvard Step Test. 
In the former, the experimental group increased the mean result by 1.4
inches, while the control group made no statistically significant increase. In
the Harvard Step Test, the experimental group increased their mean score 
from 71.67 to 80.33, a gain which was significant at the 5 per cent level of 
probability. The control group made no significant increase in this test. 

In the standing broad jump, neither group improved significantly. In the 
Burpee test, the Dodge run, and trunk extension, both groups improved sig- 
nificantly; however, in the Dodge run the experimental group’s gain was 
significant at the 1 per cent level, while that of the control group was
approximately half as great, and was significant at the 5 per cent level. The
improvements seen in both groups in these three tests indicate that either the increases
were due to learning or that the weight-training program and the
conventional physical education program were equally effective in improving the
attributes measured by these items.

Load Increases During the Training Period. The increases in strength of
the subjects in the experimental group are roughly indicated by the increased
amount of weight used in each of the exercises as the training program
progressed. Table 3 and Figure I show the mean amount of resistance handled
by the subjects over the eight weeks’ period. In every case, the strength
increase was rectilinear if the weight used in the first week’s training is not
considered. It is probable that the load used this first week was not truly
representative of the maximum ability of the subjects at that time. The more
rapid initial increase most likely was due to skill development.

Progressive Weight Training for Boys

FIGURE I. Average Resistance Handled by Subjects. [Line graph; horizontal axis: weeks. Curves: S = squat, R.E. = rowing exercise, S.P. = supine press, C = curl, M.P. = military press.]

Health Examination Results. According to the school physician, there were 
no changes in health status as determined by the pre-training and post-train- 
ing medical examinations in either the control or experimental groups. The 
nature of the medical examination is such that the appearance of hernia, or 
heart pathology, would be easily detected had such injuries occurred. How- 
ever, due to the short-term nature of this experiment, no possible deleterious 
effects on growth could be observed. 


Discussion 


The greater increases in muscle girths and number of chins and push-ups
found in the experimental group were expected. A systematic program of
resistance exercise is undoubtedly more effective than routine physical
education programs in the development of these attributes. The increase in
circulatory-respiratory function of the experimental group, as shown by 
the increase in Harvard Step Test results, was an unforeseen finding. Appar- 
ently, the demands made upon the circulatory-respiratory apparatus were 
significantly greater in the weight-training program than in the regular physi- 
cal education class periods. The degree to which this function could be 
improved by weight training is not ascertainable from this experiment. 

In view of the often expressed criticism of weight training, as having
deleterious effects on speed of movement, agility, and flexibility, it is of
considerable importance to note that the experimental group was able to improve
as much as, or more than, the control group in every item selected to measure
fitness.

The differences in the slopes of the curves of resistance used during training 
(Figure I) appear to be related to the size and degree of potential develop- 
ment of the muscle group involved. For example, in the press the subjects 
showed the least amount of improvement, while in the squat they showed the 
most. The muscle groups used in the former are relatively small. In the case 
of the squat, the muscle groups are quite large and seldom used for activities 
of a strength-building nature. For the most part the walking, running, and
bicycle riding done by boys tend to develop endurance rather than strength
in this muscle group.

It should be noted that the use of the resistance handled in the various 
exercises is at best a rough measure of strength and depends a great deal on 
the willingness of each subject to extend himself fully in attempting to com- 
plete the required number of repetitions. Under the conditions of this experi- 
ment, however, careful supervision eliminated tendencies toward careless per- 
formance of the training procedures on the part of the subjects. 


Summary and Conclusions 


Forty-six junior high school boys between the ages of 12 and 17 were
divided into two groups which were equated on the basis of arm strength, 
Harvard Step Test score, age, ponderal index, and ethnic origin. Both groups 
were given a series of eight performance tests aimed at estimating various 
components of physical fitness. Measurements of height and weight, and 
neck, chest, arm, forearm, waist, thigh, and calf girth were also made on 
both groups. The randomly chosen experimental group participated in an 
eight weeks’ progressive weight-training program in place of regular physical 
education class, while the control group took part only in scheduled physical 
education activities. 

At the end of the experiment, both groups were again measured and tested 
in the same manner as before. The experimental group averaged slight gains
in all anthropometric measurements except weight and waist girth. The 
control group gained in height and weight only. In the performance tests, 
the experimental group improved in pull-ups, push-ups, the Harvard Step 
Test, Dodge run, Burpee test, and trunk extension and flexion. The control 
group showed improvement in the Dodge run, the Burpee test, push-ups, and 
trunk extension. In no case did the improvement of the control group 
significantly exceed the improvement of the experimental group. 

In addition to the above-mentioned performance tests, the strength gain 
of the experimental group was indicated by the increase in weights used 
during the training period. Medical examinations indicated that no harmful 
effects were experienced by either the experimental or control group. 


Place-Kicking in Football

Brookings, South Dakota


Abstract


This study is an attempt to determine by experimentation with a mechanical kicking 
machine the effect of the following factors upon the place kick for distance: the point 
of the impact of the toe on the ball, the size of the angle between the kicking leg and 
the vertical at the time of the impact, the size of the angle between the long axis of the 
football and the vertical at the time of the impact, the type of football used (rubber or 
leather), the use of the detachable rubber kicking toe, the placement of the laces, and 
the inflation pressure of the football. 


IN ORDER to study seven factors affecting place-kicking (see Abstract,
above), the writer constructed the mechanical kicking machine pictured in
Figure I. The principal operating components of this device are the following:

1. A pendulum-type kicking leg welded to an axle suspended between two inverted
V-type supports. The leg is 34 inches in length and has a stop attachment for
holding the foot in a fixed position while the ball is being set.

2. A cement foot bolted to the iron leg.

3. Two extension-type springs and ten feet of airplane shock cord supplied the power
for kicking the football. A total of 460 pounds of force was exerted by the springs
and cord when they were fully extended for each kick.

4. A steel lever attached to an axle. This lever is used to draw the leg back for the
kick.

Other items of equipment used in the study were the following:

1. Four official leather footballs properly laced and inflated to a pressure of 12½-13½
pounds.

2. One rubber football properly inflated to a pressure of 12½-13½ pounds.

3. One regulation football shoe.

4. One detachable rubber kicking toe.

5. One official rubber kicking tee.

6. One 200-foot steel measuring tape.

7. One enlarged protractor (constructed of composition board by the writer).

8. One wooden platform.

9. Five yardage markers placed at distances of 30, 35, 40, 45, and 50 yards from the
machine.


Procedure 


The writer first attempted to analyze the optimum mechanical factors (dis- 
tance from axis, tilt of ball, and resultant point of impact) necessary in 
obtaining the longest possible kick with the machine. 

FIGURE I. The mechanical kicking machine constructed and used by the author. Two
extension-type springs and 10 feet of airplane shock cord supplied the power with
which the football was kicked by the cement foot on the iron leg.


For the first kick, the ball was placed directly under the axle, with the 
top tilted to the rear five degrees. This slight tilt was necessary to keep the 
ball from tipping forward from the toe. The point of impact was measured 
from the lower point of the ball and recorded. This point was established 
by chalking the toe of the shoe so that a white mark could be seen on the 
ball at the point of impact after each kick. The distance that the ball traveled 
in the air was also recorded. The writer classified each kick as to kind by 
observing the appearance and height of the ball in flight. This description 
was also recorded. 

The ball was moved forward an inch between the kicks until the distance 
the ball traveled began to decrease. Upon reaching this point, the ball was 
moved to the rear one inch between the kicks, with the top tilted to the rear 
ten degrees. When the distance that the ball traveled in the air began to 
decrease again, the ball was moved forward an inch between kicks, with the 
top tilted to the rear fifteen degrees. This process of trial and error was 
repeated until the optimum tilt and ball placement were located. When the 
ball was kicked from this point with this optimum tilt, the longest possible 
kick was obtained. 

After the optimum conditions of tilt and placement had been located, the 
following comparisons were made: 

1. Composition of Ball: The distance a rubber football will travel in the
air was compared with the distance a leather football will travel in the air
when kicked under exactly the same optimum conditions of location, tilt,
point of impact, and force.

TABLE 1
Location of Optimum Points

[Only fragments of this table are legible in the source. Surviving column headings: Angle of Tilt of Ball (degrees); Distance Ball Traveled in the Air (yards). Surviving entries: a tilt of 5 degrees; distances of 14, 14, 15, 15, 16, 17, 19, and 21 yards; three rows reading "25 H3"; and kick numbers 9 through 39.]

2. Kicking Surface: The distance a leather football will travel in the air 
when it is kicked by the machine with the regular football shoe was compared 
with the distance the same ball will travel when it is kicked by the machine 
with the detachable rubber toe attached to the shoe. The same optimum 
conditions of location, tilt, point of impact, and force prevailed in each
attempt. 

3. Position of Lacing: The distances and the directions a leather football 
will travel in the air when it is kicked with the laces facing to the rear, 
forward, to the left, and to the right, were compared. The same optimum 
conditions of location, tilt, point of impact, and force prevailed in each at- 
tempt. 

4. Degree of Inflation: The distances obtained in kicking four leather 
footballs that were inflated to 9, 11, 13, and 15 pounds of pressure, respec- 
tively, were compared. The same optimum conditions of location, tilt, point 
of impact, and force prevailed in each attempt. 


Analysis of the Data 


The first experiments were performed in an attempt to locate the following 
optimum points: 1. The point of impact of the toe on the ball; 2. The size 
of the angle between the leg and the vertical at the time of the impact; 3. The 
size of the angle between the long axis of the football and the vertical at the 
time of the impact. 

Table 1 contains the results of these experiments. The writer found that
the machine consistently kicked the ball over 45 yards when the
tee was set 15 inches in front of the point directly beneath the axle (several
kicks were made under the conditions outlined for kick number 28, Table 1). When
kicked at this point, the ball traveled the greatest distance when its top was 
tilted 15 degrees toward the kicking machine. The optimum point of impact 
was found to be five and one-half inches up on the ball. When the toe made 
contact with the ball placed in this position, the angle between the leg and the 
vertical was approximately ten degrees. 

By observation the writer roughly classified each kick into one of three 
groups: 1. Slow reverse spin; 2. Rapid reverse spin; 3. Floating. H for high 
and L for low were used to prefix the description of those kicks that traveled 
unusually high or unusually low (Table 1, column 6). 

With the ball placed at any of several other positions regarding the tilt and 
the placement, kicks of 40 to 45 yards were obtained. When the ball was set 
in one of these positions, the toe apparently struck it at a point relative to 
the center of gravity that was comparable in this respect to the optimum 
position mentioned above. 

After the optimum position of the ball had been located, the writer com- 
pared the leather and rubber footballs by kicking each ten times, with all 
controllable factors held constant. The results in Table 2 show that there 
was no significant difference between the distances obtained in kicking the 
rubber and the leather football with the machine. 
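
The comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the distances are transcribed from Table 2 as printed, the function name is hypothetical, and the study does not say which test of significance, if any, was applied, so only means and standard deviations are computed.

```python
from statistics import mean, stdev

# Distances in yards, transcribed from Table 2 (ten kicks per ball).
rubber = [45, 46, 47, 48, 47, 45, 46, 47, 47, 47]
leather = [45, 47, 47, 46, 47, 48, 47, 45, 46, 47]


def summarize(distances):
    """Return (mean, sample standard deviation) for a list of kick distances."""
    return mean(distances), stdev(distances)


rubber_mean, rubber_sd = summarize(rubber)
leather_mean, leather_sd = summarize(leather)

# Difference between the two mean distances, in yards.
mean_difference = rubber_mean - leather_mean

print(f"rubber:  mean {rubber_mean:.1f} yd, s.d. {rubber_sd:.2f} yd")
print(f"leather: mean {leather_mean:.1f} yd, s.d. {leather_sd:.2f} yd")
print(f"difference of means: {mean_difference:.1f} yd")
```

With the values as transcribed, both balls average 46.5 yards, which is consistent with the statement that no significant difference was found.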

The small variations in distances were caused by errors in placing the 
ball and in cocking the machine. If the stop attachment engaged the catch 
on the leg slightly off center, the ball was kicked slightly to the left or the 
right with a resulting reduction in distance. Because of these and other 
errors, two or three attempts were usually necessary to obtain one kick 
which could be recorded. 


TABLE 2 
Distance Comparison, Rubber and Leather Footballs 

Rubber Ball          Leather Ball 
Distance (yards)     Distance (yards) 
45                   45 
46                   47 
47                   47 
48                   46 
47                   47 
45                   48 
46                   47 
47                   45 
47                   46 
47                   47 


TABLE 3 
Distance Comparison, Regulation Football Shoe and Detachable Rubber Toe 

Regulation Shoe      Detachable Rubber Toe 
Distance (yards)     Distance (yards) 
[Table entries illegible in the source.] 


TABLE 4 
Effect of Laces upon the Flight of the Football 

         Laces to Right       Laces Facing Kicker 
         Distance (yards)     Distance (yards) 
         46                   45 
         45                   46 
         47                   47 
         45                   46 
         45                   45 
Mean     45.6                 45.8 


TABLE 5 
Effect of Varying the Air Pressure upon the Distance that the Ball Travels in the Air 

Amount of Air Pressure     Distance (yards) 
[Table entries illegible in the source.] 


Table 3 shows comparisons between ten kicks under optimum conditions 
using the detachable kicking toe and ten kicks using the regulation football 
shoe. There was no significant difference between the kicks. The value of 
the rubber toe, if any, may lie in the larger flat surface available to strike 
the ball. With this device the kicker’s margin of error in hitting the ball 
may be somewhat larger without causing the ball to deviate to the left or to 
the right in flight. 

The author next experimented with the placement of the laces to determine 
the effect upon the flight of the ball. In Table 4 the writer has recorded the 
distances of five kicks under optimum conditions with the laces placed to the 
right, and the distances of five kicks under optimum conditions with the laces 
placed so that the kicking toe struck them. Since there was no significant loss 
in distance and the lateral deviations were so small and erratic, it may be 
concluded that the placement of the laces is relatively unimportant when a 
slowly revolving kick is obtained. Although the author did not experiment 
with the placement of the laces in the high, rapidly revolving, type of place 
kick, it seems reasonable to expect a somewhat greater lateral deviation 
under such conditions. No experimentation was performed with the laces 
facing the left, for the machine would produce exactly the same result as it 
did when the laces were placed at the right. 

In Table 5 the author compares the distances obtained in kicking regulation 
leather footballs which were either over-inflated or under-inflated. Four balls, 
one inflated to 9 pounds, one to 11 pounds, one to 13 pounds, and one to 
15 pounds, respectively, were kicked three times each under optimum 
conditions. The results showed a surprisingly small deviation between the 
distances produced when the ball inflated to 15 pounds was compared with the 
ball inflated to 9 pounds of pressure. Although the ball inflated to 9 pounds 
of pressure could be compressed slightly with the hand, one of the highest, 
longest, and best appearing kicks was obtained from kicking it. Official 
pressure for game footballs is from 12½ pounds to 13½ pounds. 


Summary and Findings 


The results of this study conclusively demonstrate some of the fundamental 
factors concerning football place-kicking. However, it should be remembered 
that differences in leg length from individual to individual make a difference 
in the optimum ball position. A person with a long leg swings his foot in a 
flatter arc, and therefore places the ball farther forward to obtain the same 
results. On the other hand, an individual with a short leg necessarily swings 
his leg in an arc that rises more sharply. He places the ball closer to his 
body to obtain the same results. 

Nevertheless, the following fundamental conclusions may be stated: 

1. The optimum point of contact for the toe on the ball was approximately 
five and one-half inches up on the seam when the ball was tilted 15 degrees 
toward the kicker and the tee was set 15 inches in front of the point directly 
below the axle. When the ball was placed in this position, the angle between 
the kicking leg and the vertical was approximately ten degrees. If the ball 
was moved away from the kicker and the tilt increased, it was possible to 
locate other positions from which the ball could be kicked nearly as far. 

2. There were no significant differences in distances obtained in kicking 
the leather football as against the distances obtained in kicking the rubber 
football. 

3. The use of the detachable rubber kicking toe produced no significant 
changes in the distance that the ball could be kicked. 

4. The position of the laces made very little difference in the direction 
of the flight or the distance which the ball traveled when kicked under the 
optimum conditions. 

5. Varying the air pressure from 9 to 15 pounds had no significant effect 
upon the distance that the ball could be kicked. 

6. The medium-high, slowly revolving, end-over-end kick is the result of 
kicking the ball at the optimum height with the optimum tilt. 

7. Height may be obtained either by tilting the ball toward the kicker 
or by striking it lower with the toe. 


(Submitted 1/3/58) 


Abstract 

The past 30 years have added little to our knowledge of attitude measurement in health 
education. Some 15 scales have appeared in the literature, but few of these have been 
successfully refined and standardized. This study critically reviews these scales, and 
brings the interested reader up to date in this all-important area. It also poses several 
questions which, it is hoped, will stimulate thought and perhaps even a little research. 
A greater understanding of attitudes and how they are developed and measured is 
necessary to further health education outcomes. This will not be realized until health 
educators themselves lend their time and talents to the solution of the many problems 
in this area. 


THOSE WHO have contributed during the past 30 years to our growing 
fund of knowledge in health education have reason to be proud. The stature 
of the field is such that today it is an accepted part of the school program, 
and general education recognizes the contribution that health education can 
make to the total integration of the student into school and community life. 

While progress has been generally ordered and steady, it has been some- 
what frustrated and uncertain in the area of attitude study and measurement. 
This is primarily because most health educators are neither psychologists nor 
sociologists, and even these have had their difficulties in agreeing upon a 
definition of the concept of attitude, not to mention measuring it. The psy- 
chologist, however, has not retreated from the problem, and can point to 
much in his literature, some of it successful, some of it not so successful, that 
is valid and reliable in measuring this elusive hypothetical construct. 

This aspect of health education will develop only as health educators them- 
selves apply and perhaps even develop some of these measurement techniques. 
Edwards summarizes the situation rather succinctly: 


Knowledge, attitudes and habits have been the basis of our health principles for 
many years. To regard proper health attitudes as an essential objective of health 
instruction, and then not to adequately evaluate them, seems an essential waste of 
time and energy for the teacher and curriculum constructor. It is high time we stop 
talking theoretically about the value of health attitudes and try to be practical and 
measure them objectively (12). 


Purpose of This Study 


This paper will analyze the few actual attempts to measure health attitudes 
that have been developed during the past 30 years. Some specifically claim 
to measure attitudes; others attempt a broader measurement which also 
includes particular interpretations of habit, practice, and knowledge 
measurement. The summary of the few that have actually been standardized is 
made with the hope that the reader will become more acutely aware of the 
great gulf between what has been done and that which needs to be done in 
order to attain accepted objectives in this field. 

The fact that many are aware of this gulf is evident in the number of 
articles appearing in the literature during this same considered period— 
articles which advocate continued and intensified consideration of health atti- 
tude measurement. Another brief section of this paper will serve to indicate 
this continued clamor for further attitude study. 

Many classroom teachers are quite proficient in constructing knowledge 
tests, in all likelihood because of the demands put on them by the grading 
system. Some of these classroom teachers standardize their own tests and 
use them to good effect over a period of years. To measure the attitudes of 
their students, however, becomes a somewhat different problem; most feel 
incapable of doing it, and so they think about it but can show no progress. 
The present author is of the belief that a valid attitude scale in two equiva- 
lent forms would tell more about the progress of a particular health class, 
and the effectiveness of that teacher, than would the corresponding knowledge 
test. If this is so, the emphasis on knowledge tests to date has been unjustified. 


Review of Health Attitude Scales, 1927-1957 


In the past 30 years, 15 scales have appeared in the literature or been 
published, and these will be considered here. Few of them have been 
standardized. Only three (8, 11, 27) measure attitude specifically, while 
the others also attempt to measure knowledge and/or habits. This count is 
inclusive of all Doctoral and Master’s theses during this period, as well as 
those journals covered by Education Index (journals where most of the 
writing in this area has appeared). These 15 scales, with one exception, will 
be considered in chronological order. 

A study by the American Child Health Association, directed by Franzen 
(15), first appeared in monograph form in 1929. A battery of five tests, 
which included a True-False Test, Matching Test, Story Test, Five Rules Test, 
and Time Test, was designed to measure health education outcomes for grades 
five and six. This work represented a new approach in objective health 
education testing since it attempted to go beyond the limits of pure knowl- 
edge; it was concerned more with the measurement of behavior. In the 
author’s words: 


We must measure behaviors which indicate response under given temptations and 
counter interests. We must know the power of resistance held by given items of 
knowledge and habit. The tests which we eventually hope to have must include such 
phases of response as these, which transcend the limits of “knowledge” alone (15). 


Data from 70 cities throughout the country were used in the refinement 
and standardization of items. A total of 250 items are included in the five 
tests, with high correlations indicated within each test and low correlations 
between tests; hence, the conviction that many modes of response (behavior) 
are included in the five tests. A recognized weakness is that of curricular 
validity. A jury of four health education experts determined the percentage 
of items to be devoted within each of 15 divisions of subject matter. Again 
quoting from the author: 

There are about 250 items in the tests so that even a per cent as low as 5 means 
about twelve items. We have no evidence at present by which we may argue either 
that this representation is complete or that the relative emphasis is correct (15). 

This battery of five tests proved to be unwieldy for complete administration, 
and a shorter form, authored by Franzen, Derryberry, and McCall (16), 
was published in 1937. Called the Health Awareness Test, it consists of those 
selected items which were most diagnostic in the earlier battery. Correlations 
between this test and the original battery of five were found to be .95 for 
both the fifth and sixth grades. 

Snyder (35) developed a scale to rate fourth grade health practices which 
appeared in the literature in 1931. She was quite emphatic in stating that it 
did not represent an attempt to measure attitudes, since results in the latter 
“are usually unsatisfactory and not true indications of actual practice.” Her 
scale does not attempt to check daily practices but rather the fundamental 
practices which it is hoped will be established by the end of the fourth year 
of schooling. The scale was apparently never standardized. It contains four 
items to be rated by teacher observation, and four others seeking information 
related to the children’s habits of sleeping and eating (breakfast). 

The present author asks several questions at this point. Is there so great 
a difference between attitudes and practices or habits that we can say we are 
measuring only one area or the other? Is it logical that, as attitudes “im- 
prove,” our habits are “improved”; and that an accurate measure, therefore, 
of habits would in effect be an accurate measure of attitude? Can we more 
accurately measure feelings and beliefs (attitudes) at certain ages than we 
can habits or practices (with the tools now available) and vice versa, thereby 
making one type of scale more valid at a particular age than the other, but 
with both realizing the same ends? Are all attitudes expressed in some form 
of behavior, verbal or physical? If not, can we attribute any real educational 
meaning to them? Do health attitudes begin to form before the individual 
is aware of them? Are they so ingrained that once recognized by the indi- 
vidual they are difficult to alter or re-educate? If they are discovered before 
they pass the threshold of awareness, could the professional health educator 
do a more effective job of education? 

The trend of further literature in the considered period will provide some 
of the answers. Others will be proposed at the conclusion of this paper. 

A Test On Health Habits by McGiffert (28) was written up in 1933. 
Designed for use on the lower elementary level, it contains 15 items in three 
habit areas: general (clothing, breakfast, teeth, and coffee); sleep; and care 
of the teeth. It is assumed that the author used this test personally but no 
indication of refinement or standardization was presented. 

The Brewer-Schrammel Health Knowledge and Attitude Test (10) was 
published in 1935, and contains 100 items, which have been well standardized 
for grades four through eight. The title actually is a misnomer since sub- 
jective analysis of the test itself indicates that it is 95 per cent, if not more, 
a measure of pure knowledge. There is no attempt to measure the presence 
of health habits or the intensity of health attitudes. In the manual of 
directions, the authors indicate that one use of the test results could be to 
“determine the correlation between information and attitude (to do this it is 
necessary to single out the attitude and knowledge items),” but they do not 
indicate those items they consider to be measures of attitude; in fact, few 
appear to be. 

This type of test construction has been criticized by Patty (31) and others. 
When several grades are included in a single test, both reliability and validity 
are sacrificed, since, for example, a superior fourth grade student could only 
be expected to satisfactorily answer one fifth of the questions in a test claimed 
by the author to be suitable for grades four through eight. On the other hand, 
a scale based on only one grade, which includes no words or sentence 
structure too difficult for the students normally found in that grade, would 
certainly have more diagnostic value than one claiming to be effective for several. 

A scale to measure the attitudes of students toward their school environ- 
ment was authored and published by Bell (7) in 1937. It has been used 
successfully in the counseling of high school students. Most of the items 
are concerned with the individual’s relationship to those around him—his 
teachers and fellow students. Reliability, using the split-half method with 
Spearman-Brown correction, is .94 ± .004. The School Inventory is validated 
by two methods—discrimination between upper and lower 15 per cent and 
teacher ratings—and both indicate that it is measuring what it purports to 
measure. If used in addition to those tests and scales which evaluate specific 
instructional areas, this Bell scale can be extremely valuable. Such use has 
been reported by a few authors (9, 14, 39) who have conducted school and 
instructional evaluation programs. Continued emphasis along these lines 
needs to be encouraged. 
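
The split-half method with Spearman-Brown correction mentioned above can be sketched as follows. This is a minimal illustration and not Bell's actual procedure: the odd-even split, the function names, and the response data are all assumptions made for the example. Two half-test scores are correlated, and the correlation r is stepped up to full-test length with 2r / (1 + r).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Correlate scores on the 1st, 3rd, 5th... items with scores on the
    2nd, 4th, 6th... items, then apply the Spearman-Brown correction."""
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(first_half, second_half))


# Hypothetical data: five students, six items each (1 = keyed response).
responses = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1],
]

reliability = split_half_reliability(responses)
print(f"split-half reliability (Spearman-Brown corrected): {reliability:.2f}")
```

The correction always raises a positive half-test correlation, which is why a corrected coefficient as high as the one reported for the School Inventory still implies a somewhat lower correlation between the two halves.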

The Byrd Health Attitude Scale (11), published in 1940, has enjoyed the 
most popularity of those available. Based on the measurement technique 
devised by Likert (23), it contains 100 statements, each presenting a policy 
or activity to which the student responds by underlining one of the five 
options. It is designed for use with high school and college students, and easily 
understood directions and simplified scoring are among its several attributes. 
In the supplementary sheet containing instructor information the author 
makes the following statements in his concluding paragraph: 

There is some preliminary evidence that an individual must strongly agree with a 
favorable health practice if his behavior is to be modified. If further experimentation 
bears out this conclusion the significance of measuring the intensity of favorable atti- 
tudes becomes greater. The conclusion above would indicate that individual score 
should range above 400 points before we can expect attitudes to be transmitted into 
desirable health behaviors. In other words, low scores are apt to be more significant 
than moderately high scores, because such ratings would indicate that there is not 
sufficient intensity of favorable attitude to insure sound health behavior (11). 

Does consideration such as this tend to split hairs? In health education 
more than any other subject matter, we do not do a successful job until we 
favorably change behavior. “Intensity of attitude” up to the point where, 
but before, behavior is affected is a construct with little conscious meaning. 
If we ask the student to define the degree of his intensity, we ask him to do so 
in the light of various behaviors he has demonstrated in his recent past. 
Perhaps, then, our choice is one of two alternatives: to accept behavior 
as the most valid expression of developed attitudes and attempt to educate 
or re-educate in light of this, or to develop techniques which will measure 
attitudes before the individual can actually define or is conscious of them. 
If the latter is possible, the student would be unable to practice deception, as he 
can in most attitude scales currently available. 

Johnson (20) developed a Test on Health Habits. It contains 16 statements 
(e.g., “12. I’m not very hungry, Mother. I believe I'll just eat pie.”), and 
the author suggests that they be used as a review, as a test, or as a basis 
for discussing the health habits of the group. No standardization or 
explanation of personal use and success is presented. 

Neher’s Health Inventory for High School Students (30) appeared in 1942 
and was closely followed by the Johns Health Practice Inventory (19). The 
Neher Inventory is divided into two parts. The first includes 20 items of 
practice under the heading “What You Do About Health” while the second 
part is a test of knowledge or “What You Know About Health.” The stated 
purpose in the manual of directions is to bring health practices in line with 
health knowledge. Hence, this scale might show a student to be only average 
in his health practices, but high in his health knowledge. Such information 
would, therefore, be useful in individual and group counseling. Norms, based 
on 5,000 cases, are readily available for grades 9-12 in the manual of 
directions for the Neher Inventory. 

While the Neher Inventory offers from two to four responses to each of its 
20 health practice items, with most of them only three, the Johns Inventory 
has 36 specific practices and allows for a greater variability with five 
responses to each. Valid for junior and senior high school and the first 
two years of college, the Johns Health Practice Inventory asks the student to 
check either never, rarely, sometimes, usually or always to each of the prac- 
tices. Checks in the never column receive one point; rarely receives two; 
and they progress to always, which allows five points. Top score is 180, while 
the lowest is 36. Again it is noted that use has been made of the Likert method 
(23). 
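
The scoring rule just described for the Johns inventory can be written out as a short sketch. The names below are hypothetical, and the example assumes that every item is scored in the same direction (never = 1 through always = 5), since the text above does not mention any reverse-keyed items.

```python
# Point values for the five response options, as described in the text.
POINTS = {"never": 1, "rarely": 2, "sometimes": 3, "usually": 4, "always": 5}

N_ITEMS = 36  # number of practices on the inventory


def score_inventory(responses):
    """Sum the point values for one student's 36 checked responses."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(POINTS[answer] for answer in responses)


# The extremes reproduce the lowest and top scores given in the text.
lowest = score_inventory(["never"] * N_ITEMS)
highest = score_inventory(["always"] * N_ITEMS)
print(lowest, highest)

# A hypothetical student who checks "usually" on half the items
# and "sometimes" on the other half.
mixed = score_inventory(["usually"] * 18 + ["sometimes"] * 18)
print(mixed)
```

The same summation applies to any Likert-type scale; only the number of items and the point values change.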

Many groups have sought to evaluate community, county, and even state 
school health programs. Some have used standardized measuring scales 
available, while others have developed their own. A study by Southworth, 
Latimer, and Turner (37) is an example of the latter. During 1941-42, under 
a joint committee on health education of the Massachusetts Departments of 
Education and Public Health, an evaluation was made of the health education 
programs in selected Massachusetts senior high schools. Twenty-seven schools 
with a total of 15,480 pupils in grades 10, 11, and 12, representing about 
9 per cent of the high school population, took four paper and pencil tests 
previously designed by the committee to measure practices, knowledge, atti- 
tudes, and interests. The health attitude test, titled “Your Opinion, Please,” 
appears to be similar to others already reviewed and must be so criticized. 
Three of the statements it contains are shown here: 

There is no danger of disease from raw, unwashed fruits and vegetables. 

Deodorants which stop perspiration are undesirable. 

A city should collect and dispose of household garbage (37). 

These call for a response based primarily on knowledge rather than attitude. 
For example, many cities, Corvallis, Oregon, being one, do not collect garbage; 
the service is provided by a private company to which the householder must 
subscribe, and it has never been considered a tax item in the structuring of 
city services. By contrast, it is the accepted duty of towns and cities in 
Massachusetts. These are facts on the basis of which an answer can be made 
to this statement, and the question can hardly be considered an investigation 
of attitude. Not all the statements are of this sort, but enough of them are 
to raise a question as to the test's validity as a measurement of attitude. 

Generally, the study proved valuable in evaluating the health programs in 
Massachusetts. To the present author’s knowledge, none of the four scales 
used have been standardized and published. 

One of the more significant contributions to health attitude measurement 
has been the work of Boyd (8). As part of the Sloan Experiment in Applied 
Economics conducted by the University of Kentucky, he developed two instru- 
ments for measuring attitudes toward desirable food practices. The first was 
an attitude questionnaire in three parts (opinion questions pertaining to 
gardens, food storage, and a well-selected diet) which followed in design 
the Thurstone technique of equal-appearing intervals (40). This technique 
follows eight well-defined steps: collecting a large number of opinion items, 
sorting these items by several judges into 11 categories ranging from least 
favorable to most favorable, tabulating the position for each item as deter- 
mined by the several judges, assigning a mean value to each item, testing 
for ambiguity, testing for irrelevance, measuring validity, and checking reli- 
ability. In their final form, the three scales contained 68 opinion questions 
approximately equally divided between the three areas. Results satisfactorily 
indicated that the median scores on the scales represented the attitude status 
of the students tested toward desirable food practices. 
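
The central steps of the equal-appearing intervals technique, as summarized above, can be sketched in a few lines. This is an illustration under stated assumptions, not Boyd's actual computation: the item labels and judge sortings are hypothetical, the spread of the judges' placements (a standard deviation here) stands in for the ambiguity test, and a respondent's score is taken as the median scale value of the items endorsed.

```python
from statistics import mean, median, pstdev


def scale_value(judge_categories):
    """Mean category (1 = least favorable, 11 = most favorable) assigned by the judges."""
    return mean(judge_categories)


def ambiguity(judge_categories):
    """Spread of the judges' placements; a large spread marks an ambiguous item."""
    return pstdev(judge_categories)


def build_scale(judgments, max_spread):
    """Keep items on which the judges agree closely, with their scale values."""
    return {
        item: scale_value(categories)
        for item, categories in judgments.items()
        if ambiguity(categories) <= max_spread
    }


def respondent_score(scale, endorsed_items):
    """Median scale value of the retained items a respondent endorses."""
    values = [scale[item] for item in endorsed_items if item in scale]
    return median(values) if values else None


# Hypothetical sortings of four opinion items by five judges.
judgments = {
    "A": [10, 11, 10, 9, 10],
    "B": [2, 3, 2, 1, 2],
    "C": [1, 6, 11, 3, 9],  # judges disagree widely
    "D": [6, 7, 6, 5, 6],
}

scale = build_scale(judgments, max_spread=2.0)
print(scale)
print(respondent_score(scale, ["A", "D"]))
```

Item C is dropped because the judges could not agree on its placement, which is the purpose of the ambiguity test; the tests for irrelevance, validity, and reliability named in the text are not shown.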

In this type of attitude scale, as with those using the Likert approach (11, 
19) and also a behavioristic technique (27), the subjects must be sufficiently 
conscious of their own attitudes to be able to answer opinion or behavior 
questions. In other words: 

The attitudes must have become crystallized or established beyond the threshold of 
awareness, so that the subjects could, by introspection, answer questions which 
presumably reveal the nature and extent of the attitudes (8). 

In these methods there is no consideration for the origin, evolution, and 
growth of attitudes. 

The second Boyd test is unique in its approach and attempts to eliminate 
the problem mentioned above. Although most of the work in attitude meas- 
urement has been concerned with highly controversial issues such as religion, 
racial superiority, and birth control, this test is concerned with what consti- 
tutes a desirable food practice—an issue about which people would not be 
in wide disagreement. Opinions tend to vary in degree rather than direction. 
Boyd states the problem thus: 

It follows that measurement in this case is more difficult than in the case of 
controversial issues, because discrimination within a relatively homogeneous group is 
more difficult than discrimination among groups composed of widely divergent types 
of subjects (8). 


He based his selection of the free association technique on three reasons. 
First, attitudes favorable to desirable food practices do not suddenly appear 
but are the result of a process of growth. The method used, therefore, had 
to be sensitive enough to measure small differences in attitude in the early 
stages of their development. Second, it was felt that this method would per- 
mit the tapping of attitudes before they were evidenced through overt be- 
havior or, as expressed earlier, still below the threshold of awareness. Third, 
responses to the free association stimulus items are not affected by the 
individual’s knowledge of the right or more acceptable answer. 

Free association testing has been used primarily for diagnosing 
maladjustment. The most familiar of these tests is the Kent-Rosanoff test, as 
described by A. L. Knutson (21): 

In this test, one hundred common words are offered one at a time to the subject, 
who responds by the first word that comes to him. By the use of the frequency tables 
based on the responses of one thousand individuals, the “individuality” of each 
response can be determined. The theory underlying such a test is that the person 
who gives many very individualistic scores may have very unusual associations. It may 
be that he has a better than average or a highly specialized vocabulary. He may, 
however, be psychopathic and it is for the detection of psychopathy that the test was 
developed (21). 

Application of this method to the area of attitude testing has been attempted 
in only a few instances. Farquier (13) studied the attitudes of delinquent 
boys, and Meltzer (29) investigated the attitudes of children toward their 
parents. Boyd’s use of it is the only one in any area even remotely related 
to health. 

Briefly, the steps followed by Boyd were: selection of an initial list 
of stimulus words, elimination by judges of the least associative words, 
preliminary testing of 22 stimulus words with 148 elementary school pupils 
and 65 university students, elimination of seven more unassociative words, 
administration of the 15 remaining words to a large group of children, 
determination of favorableness of responses by judges on a continuum from 0 to 5, 
scoring of papers, and establishment of norms. 

The attitude scale and the free association test of this study had a 
correlation with each other of .31 ± .04. Both showed only a slight correlation 
with chronological age, intelligence quotients, or achievement quotients. This 
would indicate that whatever is being measured is neither closely related 
to, nor necessarily dependent upon, intelligence or academic achievement. 
Boyd felt that the .31 correlation was due in part either to the difference in 
what was being measured by the two tests or to the difference in the stages 
of development of the attitudes being measured. In Boyd’s words: 


The attitude questionnaire is designed to measure such attitudes as may be repre- 
sented by opinions which pupils endorse, while the free association test probably 
taps in their early stages of growth some of the attitudes which are not yet developed 
to the point of awareness; therefore, the two sets of scores would not be expected to 
have a high positive correlation (8). 

This type of attitude scale, the free association technique, has much to 
recommend its continued investigation by health educators. 

The Begbie Health Knowledge and Attitude Test (6) for grades 4-8 contains 
no more than 13 items of a total of 87 that can be considered attempts to 
measure feelings or beliefs. The remainder are straight knowledge and all 
are either of the true-false (items 1-64) or the multiple choice (items 65-87) 
type. Like a few already reviewed, it is an attempt to be all-inclusive and 
results are, therefore, diluted. 


Behaviorism is construed by many as the only objective measure of attitude 
we can ever hope to effect. Sumner, at the end of the last century, said: 

One can be truly said to believe only what one acts as if he believes, and the 
gauge of his convictions is the extent to which it is embodied in his actions (38). 

At a more recent date, Bain has said that the crux of the controversy in 
attitude research lies in the confusion between opinions and attitudes, the 
inseparability of attitudes and values, and the identification of attitudes with 
hypothetical, subjective states of consciousness. In Bain’s words: 

Feelings, sentiments . . . attitudes, and so on, mean nothing and worse than 
nothing, unless they are interpreted as overt behavior of some kind. . . . In other 
words, we cannot speak of the existence of attitudes or wishes or sentiments or any 
other phenomena of consciousness except as they are manifested in overt behavior. . . . 
So we may say, an attitude is the relatively stable, overt behavior of a person which 
affects his status (5). 

A Health and Safety Attitude Scale based on behaviorism has been devel- 
oped by Mayshark (26) and reviewed in the March 1956 Research Quarterly 
(27). Statistically quite successful, the technique gives promise as one that 
might be utilized more completely on other grade levels. It is suggested that 
the interested reader refer to the Research Quarterly article for a complete 
examination of the standardization techniques followed. 


Finally, mention is made by Edwards (12) of a high school attitude scale 
currently under construction which modifies the Likert technique (23) to 
contain both behavioral and emotional features in each of several short 
stories. He believes this approach will do much to eliminate the unrelated 
and ambiguous short questions and still provide the student with essential 
information necessary to get a valid response. 
Review of Periodical Literature Concerned with Attitudes


Southworth (36) has asked the question, “To what extent are we bridging 
the gap between health knowledge and health behavior; can we better apply 
what the psychologists have learned about human motivations?” 

Shaw (34) more recently asks, “Should we attempt to measure health
attitudes and behavior as a means of evaluating the success of a health teaching
program?”

Knutson (21) states this problem of evaluation in health education thus: 

When we speak of ways of evaluating health education, we are really speaking of
ways of determining how far and how successfully students have moved along the
dynamic paths of learning. What steps have they taken toward learning, applying and
integrating the principles of health education into their behavior as they try to solve the
problems they face in daily life (21).

Others have written at length of the need to develop (1, 2, 4), teach (17), 
create (24, 25, 33), evaluate (18, 32), and crystallize (3) more positive
health attitudes. These are only a few of those who have written of the need 
for understanding and research in this area. Does it not follow that if we 
stimulate research, a greater understanding will develop? 


Summary 


The past 30 years have seen very little by way of attitude measurement in 
health education. It is possible to categorize that which has been done into 
the following four areas. 

First, several authors (20, 28, 35, 37) have reported personal use of a 
scale to measure either habits or attitudes. In these cases, these scales have 
not been refined or standardized and are generally characterized by their 
subjectivity of administration. The Massachusetts study (37) is by far the 
most extensive. Here a scale to measure attitudes was constructed and ad- 
ministered to 15,480 high school students. Results of this sampling are given, 
but the literature indicates that the scale itself was not refined. Whether or 
not it was truly effective is not indicated, and no other groups have used it 
in evaluation studies. 

Second, four others (6, 10, 16, 30) claim to measure knowledge as well as 
attitudes, but careful analysis of the scales themselves reveals few items that
measure the construct attitude as we think of it. The Health Awareness Test 
(16) is perhaps the most statistically sound of the group, and the one that 
comes closest to evaluating attitudes. 

Third, the Bell School Inventory (7) and the Johns Health Practice 
Inventory (19) are two well-refined and standardized scales which fringe 
the area of attitudes. Bell’s scale measures the student’s attitudes toward his 
school environment, teachers, and fellow pupils, while Johns’ scale measures 
actual habits and practices in 36 specific instances. 

Fourth, there are now available in the field four well-developed scales by 
three authors (8, 11, 27) which make use of four decidedly different psycho- 
logical techniques to measure attitudes and a fifth scale which should be 
ready for use soon (12). Byrd (11), who applied the Likert technique (23),
was the first in the field to attempt specific measurement of attitudes, and,
as mentioned earlier, has enjoyed the most popularity. Boyd (8) developed 
two scales to measure attitudes toward desirable food practices: the first using 
Thurstone’s equal-appearing interval technique (40); the second using a 
psychiatric contribution to attitude measurement—the free association tech- 
nique. Mayshark (27), after the thinking of Bain (5), Sumner (38), and 
others, developed an attitude scale suitable for the seventh grade using 
behavior situations and known as the situation-response technique. Edwards 
(12) has written of a forthcoming attitude scale suitable for high school 
students modifying the Likert technique (23), to include behavior elements. 


Conclusion 

In a field that is still developing, we need to stimulate a greater interest in 
and understanding of the psychological aspects of health education. The 
little that has been done to further attitude measurement gives the promise 
of eventual success in realizing this important health outcome. There still 
remains enough to keep many Doctoral candidates and interested researchers 
in this field busy for many years. 

Many criticisms and questions have developed during the progress of this 
paper. The criticisms are meant to point up particular needs that have been 
found lacking; the questions seek no conclusive answers because this is 
impossible, but have been asked to stimulate thought in this area. 
A Safety Attitude Scale
for the Seventh Grade¹

Mount Pleasant, Michigan


Abstract 

This study represents an attempt to construct an attitude scale which serves as an 
instrument to measure the attitudes of seventh grade students toward safety. A behavior- 
istic-type question (situation-response) was adopted for this study as an expression of 
attitudes. The major steps followed were: selecting a measuring technique, developing 
a preliminary scale, refinement of preliminary scale, and establishment of final forms and 
norms. Validity was established and, through the use of the interform reliability
method, a reliability coefficient of .897 was obtained.


BECAUSE TODAY’S fast pace of living affects the safety of young and old 
alike, educators are becoming increasingly aware of the importance of atti- 
tudes and opinions in the behavior of young people of school age in situations 
involving safety in the home, community, and schools. The school, together 
with the parents and the community, has a responsibility towards personal 
safety and the safety of society in general. 

Not only does the school have the responsibility to formulate desirable 
safety attitudes, it also has the rare opportunity to measure and evaluate such 
attitudes in order that the student may live a safer, longer, and more vigorous 
life. 

Many teachers have accepted the engendering of desirable attitudes as a 
part of their instructional duties and as desirable outcomes of education. 
The development of attitudes in health and safety education courses has been 
accepted rather generally as an objective of health and safety education; 
however, health and safety teachers have directed little attention toward its 
attainment. This has been due, in part, to the lack of an instrument with 
which the health and safety teacher might evaluate health and safety attitudes. 
People in health and safety education have done little in attitude measurement.
In other words, teachers should attempt to do more than give “lip
service” to the development and measurement of attitudes. Grout (6), Ober- 
teuffer (16), Stack (23), Langton and Anderson (11), and Patty (18) are a 
few who have stressed the need of health and safety attitude measurement. 

The purpose of this study was to construct a reliable and valid instrument 
which can be used to evaluate the attitudes of seventh grade students toward 
certain areas in safety. 

The seventh grade was chosen because many child development experts 
believe that the period from 11 to 13 years of age is a formative stage in 


1 This study was made in partial fulfillment of the requirements for the degree of 
Doctor of Health and Safety in the School of Health, Physical Education, and Recreation, 
Indiana University, Bloomington, Indiana, September 1955. 




the development of attitudes. Conklin (4) and Remmers and Wheeler (20) 
support this position, and Mayshark (15) also accepted this thesis when 
he directed his attitude scale at the seventh grade level. By attempting 
to measure safety attitudes at a specific grade level, educators will have 
an opportunity to fight the number one killer of young people in the United 
States today—accidents. This does not mean, however, that safety scales 
should not be constructed for use at each grade level or that attitude measure- 
ment should be minimized in other grade levels. 


Selection of an Attitude Index 

An attempt to measure an attitude is an attempt to measure an intangible. 
No matter how real an attitude may seem to its possessor, it cannot be meas- 
ured directly. The two recognized ways in which an attitude may be expressed 
are in non-verbal behavior and in verbal or symbolic behavior. Either of 
these expressions of an attitude may be used as an index to the attitude. 

For the purpose of this study, the investigator accepted Bernard’s definition 
of an attitude (1), which is: “An attitude is partial or symbolic behavior 
preparatory to overt adjustment and is transformed into true overt adjustment 
behavior as the adjustment proceeds.” 

This definition lends itself to the behavior situation (situation-response) 
concept followed in this attitude-scale construction, since the objective is what 
a person says he would do in a variety of specific safety situations and not 
what a person says he believes. Thus, the attitudinal behavior of the student 
can be known by and communicated to someone else only through its overt 
symbolic responses. 


Review of Related Studies 


Probably one of the earliest, and certainly one of the simplest, methods 
for determining an attitude was the use of the questionnaire. One of the first 
questionnaires was developed by Harper (7) in 1925, and a more elaborate 
use of the questionnaire method in determining attitudes was employed by 
Katz and Allport (8) in 1931. 

Arbitrary scales were devised and used by Stagner (24), Kirkpatrick (10), 
Lenz (12), Likert (13), Rosander (21), and Pace (17). The most renowned 
experimental scale in attitudinal measurement is the method of equal- 
appearing intervals created by Thurstone and Chave (26). Remmers (19) 
devised a modification of the Thurstone Scale which makes it possible to
measure a large number of attitudes by using a single scale. 

Reliable, objective, and valid tests in the area of attitude testing are 
lacking in the field of health and safety education. The situation is especially 
critical in terms of good tests for measuring health or safety attitudes of 
school-age children. There have been some attempts to measure health 
attitudes as attested by the work of Franzen (5), Brewer and Schrammel (2), 
and particularly by Byrd’s Health Attitude Scale (3) designed to measure 
health attitudes of the group or the individual. Byrd’s scale may be classified 
as a rating scale, patterned after the Likert technique (13). Mayshark (15) 
TABLE 1

Number and Percentage of Items According to Location and Type of Accident
(Derived from Indiana Health Textbooks and State Courses of Study)

Location and Type                            Percentage of items

Home
  Falls                                             6.67
  Firearms                                          3.33
  Burns, scalds, explosions                         3.33
  Electricity                                       1.67
  Suffocation                                       1.67
  Poisons, poisonous gas                            1.67
  Fire                                              1.67
  Total                                            20.01

Community
  Pedestrian                                        8.33
  Bicycle                                           8.33
  Vehicle-occupant                                  3.33
  Water                                             1.67
  Playground (not school)                           1.67
  Total                                            23.33

School
  Organized activities                              8.33
  Classroom-auditorium                           [missing]
  Shower                                         [missing]
  Vehicle-occupant (school jurisdiction)            5.00
  Gymnasium                                         5.00
  Apparatus                                         5.00
  Stairs-stairway                                   5.00
  Corridors                                         5.00
  Pedestrian (school jurisdiction)                  3.33
  Vocational shops                                  3.33
  Bicycle                                           3.33
  Total                                            56.66

Grand total                                       100.00
was the first person to use the situation-response, behavioristic-type item in
a health attitude rating scale. Siebrecht’s Attitude Scale (22) is directed at 
driver training and not toward general safety as it applies to the home, 
school, and community. 


Procedures 


An instrument patterned after the situation-response attitudinal measure- 
ment technique as used by Pace (17), Rosander (21), and Mayshark (15) 
was adopted for this study. A multiple-choice, four-option item was used, in 
which the safety situation was outlined in the stem of the question and was 
followed by four possible ways of reacting or behaving in that particular 
situation. The alternatives ranged from least desirable to most desirable, with
two positions somewhere between these two extremes along the attitude continuum
being measured. Even though the instrument is designed to measure
attitudes, knowledge and understanding have a direct bearing on the feelings
held by an individual; this being the case, knowledge is also being measured
to a certain extent.

The 1954 edition of Accident Facts (National Safety Council, Chicago) 
served as a guide in determining which general areas and what percentage of 
emphasis per area would be used as the basis for scale content. The three 
general areas decided upon were school, community, and home. These in turn 
were broken down into the different types of accidents involved within each 
area, all of which comprised a table of specifications as shown in Table 1. 

After careful analysis of the Indiana state-adopted health textbooks and 
several state health courses of study (see Appendix A), statements pertinent 
to safety were developed into situation-response questions covering and 
including all the areas in correct percentage of emphasis. As a result of this 
procedure, it was felt that curricular validity was established and that a true 
picture was obtained of the safety attitudes a seventh-grade student should 
possess. 

At this stage, items were constructed for each safety attitude with a total 
of 200 items as the desired goal for the entire work. Each item was written 
in the vocabulary of the seventh grader as determined by Thorndike and
Lorge (25). A total of 188 situation-response items were mimeographed and 
placed in test booklet form, with the items appearing by areas just as they 
were constructed. The items were assigned numerical weights by a select jury. 
This jury was made up of 22 graduate students with experience in the field 
of health, physical education, recreation, and safety plus five professors in 
this field. 

The judges were asked to give a numerical value of 4 to the alternative 
they thought to be the most desirable response in view of the safety situation 
mentioned in the stem; a numerical value of 3, to the next most desirable 
response; a numerical value of 2, to the next to least desirable response; and 
a numerical value of 1, to the least desirable response. Sixty per cent agree- 
ment among the judges was the criterion arbitrarily set for establishing the 
proper order of the alternatives. In terms of the weight assigned to each 
alternative, 165 of the original 188 items met the criterion for acceptance. 
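
The agreement criterion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tabulation actually used: the function name `consensus_weights`, the data layout (one list of four weights per judge), and the reading of "sixty per cent agreement" as "at least 60 per cent of the judges assign the same weight to each alternative" are all hypothetical, since the text does not say how agreement was tallied.

```python
from collections import Counter


def consensus_weights(ratings, threshold=0.60):
    """Return the agreed weight for each alternative, or None if the item fails.

    ratings[j][k] is the weight (1 = least desirable, 4 = most desirable)
    that judge j gave to alternative k. Hypothetical layout.
    """
    n_judges = len(ratings)
    n_alternatives = len(ratings[0])
    agreed = []
    for k in range(n_alternatives):
        votes = Counter(judge[k] for judge in ratings)
        weight, count = votes.most_common(1)[0]
        # Assumed rule: the most frequent weight needs >= 60 per cent of judges.
        if count / n_judges < threshold:
            return None
        agreed.append(weight)
    # The agreed weights must rank the alternatives without ties (1, 2, 3, 4).
    if sorted(agreed) != list(range(1, n_alternatives + 1)):
        return None
    return agreed
```

With the 27 judges described above (22 graduate students and five professors), a 60 per cent threshold corresponds to at least 17 judges giving the same weight.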

On the basis of their attitude object, the 165 items within each location 
area were divided into three separate stacks; each stack received an equal 
number of items pertaining to each area. These stacks were called Prelimi- 
nary Forms A, B, and C. The items related to each locational area and sub- 
area, or injury type, were placed in similar positions in each of the three 
forms. Logically arranging the items within three forms in this manner is not 
actually a basis for determining equivalent forms; however, it does give as- 
surance, even though the items are different within each area and sub-area, 
that an attempt is made to measure the same attitude object. 





TABLE 2

Statistical Analysis of the Results of the Administration of Preliminary Forms A, B, and C

[Table body illegible in the source. Columns: Boys, Girls, and Total for each of Forms A, B, and C; the one legible row label is Standard deviation.]







It was decided to use in this study only the schools in the third-class cities 
of Indiana. According to the 1950 census, there were six cities falling into 
the third-class category of 20,000 to 35,000 population. A letter was sent to 
the superintendents of the schools in the six cities, requesting permission to 
administer a safety rating scale to their seventh grade students. All of the 
superintendents promptly replied, giving permission for the administration 
of the preliminary forms of the safety attitude scale to their seventh grade 
students. 

The investigator then visited each of the six cities and personally adminis- 
tered the safety attitude scale to a total of 718 seventh grade boys and girls. 
The students were given instructions prior to the test, and they were told
how to use the IBM answer sheets properly. A total of 83 answer sheets were
rejected because of incomplete answers, marking of two answers, and failure 
to complete the scale in the allotted time. Table 2 gives the results of the 
preliminary administration of the scales in terms of variability, reliability, 
and measures of central tendency. 


ITEM VALIDITY 


Statistical validity of the three forms was determined by a technique em- 
ployed in three previous studies by Kelley (9), Vernon and Allport (27), 
and Mayshark (15), all of whom used critical ratios as a measure of item 
discriminating power. A critical ratio of 3 was set as the criterion for item 
acceptance in this study. On the basis of total score, the upper and lower 
27 per cent of the cases on each of the three forms were separated, and the 
critical ratio for each item was computed. Of these items, 137 had the desired 
critical ratio of 3 or better. 
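
The upper-lower 27 per cent comparison can be sketched as follows. The text does not give the formula used for the critical ratio; the sketch assumes the conventional one, the difference between the two group means divided by the standard error of that difference. The function names and the data layout (one list of item weights per student) are hypothetical.

```python
import math
from statistics import mean, variance


def critical_ratio(group_a, group_b):
    """Difference between two group means divided by its standard error."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    diff = mean(group_a) - mean(group_b)
    if se == 0:
        # Neither group shows any spread on this item.
        return math.inf if diff > 0 else (-math.inf if diff < 0 else 0.0)
    return diff / se


def item_critical_ratios(responses, fraction=0.27):
    """Critical ratio of every item between the upper and lower score groups.

    responses[s][i] is the weight (1-4) student s earned on item i.
    Groups are the top and bottom `fraction` of students by total score.
    """
    ranked = sorted(responses, key=sum)
    k = max(2, round(len(ranked) * fraction))
    lower, upper = ranked[:k], ranked[-k:]
    return [critical_ratio([s[i] for s in upper], [s[i] for s in lower])
            for i in range(len(responses[0]))]
```

An item would then be retained when its ratio is 3 or better, for example `[i for i, cr in enumerate(item_critical_ratios(responses)) if cr >= 3]`.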


CRITERIA FOR EQUIVALENCE 


For the purpose of equating the final forms, the items were paired on the 
basis of comparable critical ratios and item content, with Final Form A and 
Final Form B each receiving one item of each pair of items. The 60 pairs 
of items were randomly placed in the final forms by use of Lindquist’s table 
of random numbers (14), and each of the four alternatives for each item was 
randomly placed within that item. 
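
A minimal sketch of this pairing step is given below. It departs from the procedure above in ways that should be kept in mind: a seeded pseudo-random generator stands in for Lindquist's table of random numbers, "comparable item content" is reduced to membership in the same area, and matched items are assumed to occupy the same position in both forms. The dictionary keys `id`, `area`, and `cr` are hypothetical.

```python
import random
from collections import defaultdict


def build_parallel_forms(items, seed=0):
    """Split items into two forms of matched pairs.

    items: list of dicts with hypothetical keys 'id', 'area', and 'cr'
    (critical ratio). Returns (form_a, form_b) as lists of item ids.
    """
    rng = random.Random(seed)
    by_area = defaultdict(list)
    for item in items:
        by_area[item["area"]].append(item)

    pairs = []
    for area_items in by_area.values():
        ordered = sorted(area_items, key=lambda it: it["cr"], reverse=True)
        # Neighbours in the sorted list have the most comparable ratios.
        # An unpaired leftover item (odd count) is dropped.
        for first, second in zip(ordered[0::2], ordered[1::2]):
            pair = [first, second]
            rng.shuffle(pair)  # chance decides which form gets which item
            pairs.append(pair)

    rng.shuffle(pairs)  # random position of each pair within the forms
    form_a = [pair[0]["id"] for pair in pairs]
    form_b = [pair[1]["id"] for pair in pairs]
    return form_a, form_b
```

The random ordering of the four alternatives within each item could be handled the same way, with one further `rng.shuffle` call per item.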


Results 

Final Forms A and B were administered to a new population sample of 
seventh grade students of the six third-class cities of Indiana. The final forms, 
along with a manual of directions, IBM scoring sheets, and IBM pencils, were 
sent to the participating schools and with only one exception were adminis- 
tered by the classroom teachers who had assisted the investigator during the 
preliminary testing. In the one exception where the classroom teacher could 
not administer the scales because of a heavy schedule, the investigator himself 
administered Final Forms A and B to the group of seventh grade students. 

A total of 341 copies of Final Form A and 337 copies of Final Form B were
completed. The statistical analysis of the results of the administration of Final
TABLE 3

Statistical Analysis of the Results of the Administration of Final Form A
and Final Form B

                               Form A                          Form B
                       Boys     Girls     Total        Boys     Girls     Total
Number                  155       186       341         173       164       337
Range               121-231   129-232   121-232     114-232   144-224   114-232
Mean                196.626   213.274   205.707     191.428   205.817   198.430
Median                  208       217       215         197       211       207
Standard deviation   28.146    15.128    22.419      24.501    17.295    22.561




















TABLE 4

Statistical Summary of the Differences Between the Scores of the Girls and the Boys
on Final Forms A and B

Form   Statistic              Girls      Boys   Difference   Critical ratio
A      Mean                 213.274   196.626       16.648      [missing]
       Median                   217       208        9.000      [missing]
       Standard deviation    15.128    28.146       13.018      [missing]
B      Mean                 205.817   191.428       13.389      [missing]
       Median                   211       197       14.000      [missing]
       Standard deviation    17.295    24.501        7.206      [missing]








Form A and Final Form B is shown in Table 3. 

Table 3 shows that there was a difference in the results of boys and girls
on Final Form A and Final Form B. In view of the existing differences between
the boys and the girls on the various statistics, it seemed desirable to
analyze the difference between the scores of the girls and the boys on Final
Forms A and B. Table 4 shows the relationship between the girls and the
boys on the two final forms.


EQUIVALENCE OF FORMS 


On the basis that there was a significant difference between the boys and 
girls on both forms, it seemed logical that the equivalence of the forms should 
be ascertained as applied to the girls on both forms and to the boys on both 
forms, rather than determining the significance of the difference between the 
two forms. Table 5 summarizes this comparison, and the data indicate that
Final Forms A and B approach equivalence in all statistics except for the
difference between the means of the girls, which had a significant
critical ratio of 4.824.


INTERFORM RELIABILITY 


Authors of many standard psychological and educational tests employ the 
equivalent or parallel forms method of determining reliability when alternate 
forms are available. The interform method, another term for the equivalent 
or parallel method of determining reliability, was used in this study. 
TABLE 5

Statistical Summary of the Equivalence Between Final Forms A and B According to Sex

                                     Girls                                      Boys
Statistic            Form A    Form B   Difference   Critical    Form A    Form B   Difference   Critical
                                                       ratio                                       ratio
Mean                213.274   205.817      8.457       4.824    196.626   191.428      5.208        .603
Median                  217       211          6     [missing]      208       197         11       2.982
Standard deviation   15.128    17.295      2.167       1.746     28.146    24.501      3.645       1.752




















TABLE 6

The Mean, Standard Deviation, Reliability Coefficient (Pearson r), and Standard Error
of Final Form A and Final Form B Resulting from the Administration
of These Forms to Each of 124 Seventh Grade Students

Statistic                 Form      Girls      Boys     Total
Mean                      A        211.59    192.75    202.23
                          B        206.07    187.05    197.64
Standard deviation        A        17.635    25.495    23.466
                          B        13.856    26.115    22.332
Reliability coefficient              .872      .891      .897
Standard error                      .0288     .0278     .0176

















A new sampling of seventh graders from two more schools in the third- 
class cities participated in this phase of the study. The two forms were dis- 
tributed alternately to the participating students. A week after the first 
administration of the final forms, each student completed the opposite form 
from the one he had initially received. Care was taken to insure that the 
students receiving Form A at the first administration would receive Form B 
at the second administration. This phase of the study yielded a
coefficient of reliability of .897 between the scores made by 124 seventh grade
students on the two forms, as determined by this interform method.
The high agreement between the forms indicates that each form did a fairly 
accurate job of measurement and that the two forms are closely equivalent. 
Table 6 gives the statistical summary of this administration for interform 
reliability. 
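
The interform coefficient is a Pearson r between each student's scores on the two forms. The sketch below computes r from paired scores and adds the classical large-sample standard error of r, (1 - r²)/√N. The text does not state which standard-error formula was used, so that formula and the function names are assumptions.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between paired score lists x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standard_error_of_r(r, n):
    """Classical large-sample standard error of a correlation coefficient."""
    return (1 - r ** 2) / math.sqrt(n)
```

Under this assumed formula, `standard_error_of_r(0.897, 124)` evaluates to about .0175, close to the standard error reported for the total group in Table 6.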


ESTABLISHMENT OF NORMS 


From the data in Table 6 were found the differences between: the means 
and the standard deviations of the girls and boys on Final Form A; the girls 
and boys on Final Form B; the girls on Form A and Form B; the boys on 
Form A and Form B; and of the total 124 seventh graders on Final Form A 
and Final Form B. 

For the girls and boys on Form A, a critical ratio of 4.839 was found for
the difference between the means, and a critical ratio of 2.741, between the 
standard deviations. 
For the girls and boys on Form B, a critical ratio of 4.839 was found for
the difference between the means, and a critical ratio of 4.381, between the 
standard deviations. 

For the girls on Forms A and B, a critical ratio of 2.029 was found for 
the difference between the means, and a critical ratio of 1.973, between the 
standard deviations. 

For the boys on Forms A and B, a critical ratio of 1.147 was found for 
the difference between the means, and a critical ratio of .178 for the difference 
between the standard deviations. 

For the total 124 boys and girls on final Forms A and B, a critical ratio 
of 1.572 was found for the difference between the means, and a critical ratio 
of .549 for the difference between the standard deviations. 

Four of the above comparisons had a critical ratio under 3, which means 
that there was no significant difference, and the forms were closely equivalent 
in terms of means and standard deviations. 

However, a significant difference existed between the means and standard 
deviations of the total girls and boys on Final Form B, with a critical ratio 
of 4.839 between the means, and a critical ratio of 4.381 for the difference 
between the standard deviations. The fact that the lowest score for the boys 
on Form B was 115, as compared to the lowest score of 146 for the girls on 
the same form, partially explains the significant difference existing between 
the sexes on Final Form B. Actually, three boys received a score lower than 
the lowest score recorded for the girls. 

As a result of administering both final forms to the same population, 
statistical results in terms of equivalency of forms were revealed contrary 
to what had already been determined earlier in the study. Table 5 shows that 
a significant difference exists between the means of girls on Final Form A 
and Final Form B, whereas, after administering both forms to the same 
population of seventh grade girls, a significant difference did not exist be- 
tween the means. In the first case, a critical ratio of 4.824 was obtained; 
and in the second case, a critical ratio of 2.029 was obtained. The latter 
critical ratio is the more nearly valid figure, because the same population 
was used in the administration of both forms, and this evidence substantiates 
the close equivalency of Final Forms A and B. 

As a result of the existent significant difference between both means and 
the standard deviations of the total 124 girls and boys on Final Form B, 
it was decided to establish percentile norms for both sexes on both forms for 
the seventh grade population in the six third-class cities of Indiana. 
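
Separate percentile norms for each sex and form could be built along the following lines. The text does not describe how the percentiles were computed; the sketch assumes the common mid-point definition, in which a score's percentile rank is the percentage of scores below it plus half the percentage equal to it. The function names and the sample scores in the usage line are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(scores, value):
    """Percentage of scores below `value`, counting half of those equal to it."""
    ordered = sorted(scores)
    below = bisect_left(ordered, value)
    equal = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * equal) / len(ordered)


def norm_table(scores_by_group):
    """Map each group (e.g. 'Form A girls') to {raw score: percentile rank}."""
    return {
        group: {s: percentile_rank(scores, s) for s in sorted(set(scores))}
        for group, scores in scores_by_group.items()
    }
```

For example, `norm_table({"Form A girls": [200, 210, 210, 220]})` assigns the raw score 210 a percentile rank of 50.0.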


Conclusions 


The following conclusions were drawn from this study: 

1. By using the interform or parallel form method for determining reli- 
ability between two equivalent forms, a reliability coefficient of .897 was 
found between Final Form A and Final Form B. A reliability coefficient of 
.897 is very high in terms of attitude measurement. 
2. Final Forms A and B are valid measures of the safety attitudes of
seventh grade students attending schools in cities of the third class in Indiana. 

3. Form A and Form B are objective in terms of item construction, ad- 
ministration of the scales, scoring of the answer sheets, and interpretation of 
the statistical results. 

4. Form A and Form B are closely equivalent forms. 


APPENDIX A


TEXTBOOKS AND COURSES OF STUDY 


Indiana State Adopted Health Textbooks for the Junior High School Level
Laidlaw Brothers, New York, The Road to Health Series. 
Jones, Edwina, Bertine Maloney, Edna Morgan, and Paul E. Landis, Health Trails, 
1949, 256 pp. 
Jones, Edwina, Edna Morgan, and Paul E. Landis, Your Health and You, 1949, 320 
pp.; Keeping Healthy, 1949, 320 pp. 
Ginn and Company, Boston, Safe and Healthy Living Series. 
Andress, J. Mace, I. H. Goldberger, Marguerite P. Dolch, and Grace T. Hallock, 
Safety Every Day, 1945, 258 pp. 
Andress, J. Mace, I. H. Goldberger, and Grace T. Hallock, Doing Your Best for Health, 
1945, 298 pp.; Building Good Health, 1945, 298 pp. 
Scott, Foresman, and Company, New York, Health and Personal Development Series. 
Shacter, Helen, and W. W. Bauer, The Girl Next Door, 1948, 256 pp.; You, 1948, 288 
pp.; You and Others, 1949, 288 pp. 
State and School Courses of Study 
Arizona, State Department of Education. Course of Study, Elementary Schools of Arizona,
1946, 165 pp. 
Board of School Trustees, Bloomington, Indiana. Safety for Elementary Grades. 1942, 
50 pp. 
Nebraska, Department of Public Instruction. Science for Nebraska Elementary School 
Children. 1951, 204 pp. 
New Mexico, State Department of Education. Curriculum Guide for Elementary Schools 
in New Mexico. 1950, 139 pp. 
New Mexico, State Department of Education. Teachers Guide in Safety Education for 
Elementary Schools. 1951, 50 pp. 
Oakland (Calif.) Public Schools. A Graded List of Safety Learnings for Use in Elemen- 
tary Schools. March 1942, 16 pp. 
University of the State of New York Bulletin. Safety Education. No. 1324, Sept. 1946, 
74 pp. 
Virginia, State Board of Education. Planning Together for Health. June 1950, 65 pp. 


APPENDIX B 
FINAL FORMS A AND B 


Because of space limitations, only the first ten situation-response items are given for
each form. Both Forms A and B contain 60 situation-response, multiple-choice items. 
Form A 

1. The sidewalk in front of your home is covered with ice. You would: 

1. put salt or ashes on the ice. 

2. tell your parents about the ice on the sidewalk. 
3. run and slide on the ice. 

4. be very careful when walking over it. 








10. 


For 
1 


> 2 pene 
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. Your six-year-old sister finds a firecracker and starts to light it. You would: 


1. tell your sister how to light the firecracker. 

2. destroy the firecracker. 

3. watch her try to light the firecracker. 

4, take the firecracker and match away, and light it yourself. 


. Your friends are roughing it up in a shower room, and they ask you to play with 


them. You would: 

1. tell your friends to be careful. 

2. tell your teacher about the rough play in the shower room. 
3. join in the scuffling. 

4. refuse to play “rough” while in the shower room. 


. You see a teammate catching in a softball game without a head mask. You would: 


1, tell your teammate to wear a mask when he is catching. 
2. refuse to play ball until your teammate puts on a mask. 
3. throw a ball at his head to scare him. 

4. say nothing to your teammate. 


. Your teacher outlines the safest way for you to go to school. You would: 


1. take your own route which you think is better than hers. 
2. pay no attention to her suggestions. 

3. follow the route she outlines. 

4. follow the shortest route regardless. 


6. You are walking down the railroad tracks when you hear a train approaching. 
You would: 

1. see how close you can stand to the train as it passes by. 

2. get off immediately and stay off. 

3. stand a safe distance away from the tracks until the train has passed by. 
4. stay on the tracks until you see the train coming. 


7. When you are batter in a softball game, you become angry at the pitcher. You would: 


1. swing at the ball and let the bat fly in the direction of the pitcher. 

2. try to confuse the pitcher by yelling at him. 

3. bunt toward first base and run into the pitcher as he fields the ball. 
4. control your temper and say nothing. 


8. A policeman stops you and your friends from playing on a crowded sidewalk. You 
would: 

1. stop playing on the sidewalk. 

2. continue playing on the sidewalk after the policeman leaves. 
3. move to the sidewalk in the next block. 

4. play in the street. 


9. You are coasting on your bicycle down a hill at the bottom of which is a four-way 
stop sign. You would: 

1. plan to stop the bicycle at the proper place. 

2. keep looking for cars, and slow down or speed up your coasting to avoid any 
cars crossing the intersection. 

3. give the stop signal and stop the bicycle at the intersection. 

4. run through the stop sign if no cars are coming. 

10. The softball diamond on which you play has some glass and sharp rocks on it. 
You would: 

1. pick up the glass and rocks which are around the bases and home plate. 

2. play ball without removing the glass or rocks. 

3. start the ball game, and pick up the glass and rocks during the game. 

4. refuse to play ball until all the glass and rocks have been removed. 


Form B 

1. A friend suggests that you climb to the top of a tall tree. You would: 
1. attempt to climb to the very top. 
2. dare your friend to go up first. 
3. climb halfway up the tree and stop. 
4. refuse to go up the tree. 
2. It is the Fourth of July, and your uncle gives you some firecrackers. You would: 
1. set off a firecracker under a tin can. 
2. throw a lighted firecracker at your uncle. 
3. ask your uncle to set off the firecrackers for you. 
4. light them and see how long you can hold each one. 
3. Your teacher tells you that it is unsafe to play in the shower. You would: 
1. listen to the teacher but continue to play in the shower. 
2. listen to the teacher and stop playing in the shower. 
3. encourage others to stop playing in the shower. 
4. encourage others to play in the shower. 
4. A softball is hit over your head during a game on the school playground. When it 
bounces toward a group of smaller children, you would: 
1. push the small children aside as you chase the ball. 
2. walk through the group to get the ball. 
3. stop the game until the children are no longer in the way. 
4. yell at the younger children to watch out. 
5. You are in a hurry to get home after school, and a school patrolman tells you to 
stop at an intersection. You would: 
1. tell the patrolman that you have to get home in a hurry. 
2. look right and left for oncoming traffic before dashing across the street. 
3. wait until the school patrolman waves you across the street. 
4. walk to the middle of the block before crossing the street. 
6. Your teacher suggests that you stay off railroad bridges because they are dangerous. 
You would: 
1. never walk on railroad bridges. 
2. walk on railroad bridges only when you are sure no trains are coming. 
3. walk on railroad bridges only when no one is watching. 
4. walk on railroad bridges any time you feel like it. 
7. While playing softball on a hard surface area, you attempt to steal second base. 
You see that the second baseman has the ball and is waiting to tag you. You would: 
1. try to knock the second baseman down and out of the way. 
2. try to run around the second baseman. 
3. slide under the second baseman. 
4. stop and try to return safely to first base. 
8. A policeman tells your group that it is dangerous to play ball in the street. You 
would: 
1. refuse to play in the street. 
2. play in the street, but stop playing when a car approaches. 
3. tell the policeman that no one will get hurt by playing in the street. 
4. wait until the policeman leaves and then start playing in the street again. 
9. You are riding your bicycle and stop behind a long line of cars in a traffic jam. 
You would: 
1. pass the cars on the right side in order to get up front. 
2. cross over to the other side of the street and ride ahead facing traffic from 
the opposite direction. 
3. get off your bicycle and push it around the cars to front of the line. 
4. wait for traffic to move before you ride on. 
10. A third grader accidentally throws his ball on top of the school building. You would: 
1. try to get the ball for him. 
2. tell him he can get it himself. 
3. leave the ball alone. 
4. ask the janitor to get the ball. 
(Submitted 12/5/57) 
Abstract 

The purpose of this study was to compare the effectiveness of a single daily six-second 
exercise bout using two-thirds maximum tension with an exercise program involving 
more frequent exercise bouts at 80 per cent maximum tension. Thirty post-pubescent 
boys divided into two experimental groups and one control group served as subjects. 
Each experimental group was given a four weeks’ training program restricted to isometric 
exercise of the wrist, the programs differing only in regard to the frequency of the exer- 
cise bouts and the levels of static muscular tension employed. The results generally 
supported the findings of Hettinger and Muller, in that brief periods of isometric tension 
(one six-second bout daily at two-thirds maximum tension) proved to be as effective for 
strength development as more frequently repeated exercise bouts at higher levels of 
tension. However, the latter method was found to be somewhat superior in terms of 
strength retention. 


THE STUDIES of DeLorme (3), DeLorme and Watkins (4) and the more 
recent investigations of Hettinger and Muller (6, 7) have revived interest in 
seeking economical methods for the development of muscular strength. The 
results of these studies indicate that the time required for building static 
muscular strength can be substantially reduced over that previously believed 
to be necessary. The effectiveness of DeLorme’s heavy resistance, low repe- 
tition exercise program in strength development has been substantiated by 
Houtz, Parrish and Hellebrandt (9), Hoag (8), and Darcus and Salter (2). 

Whereas DeLorme experimented with isotonic exercise, Hettinger and 
Muller used short periods of static muscular effort with the tension level 
maintained at two-thirds maximum isometric strength. Hettinger and 
Muller (6) reported that one daily exercise bout in which the subject maintained 
for six seconds two-thirds maximal tension was as effective in building 
strength as longer and more frequent periods of static exercise. The resulting 
gains in strength were of the order of approximately 5 per cent per week. 
With the termination of the training program, Hettinger and Muller (7) 
found that the loss in strength occurred at about the same rate as the gains 
achieved during training. On the other hand, Clarke, Shay, and Mathews (1) 







334 The Research Quarterly, Vol. 29, No. 3 


reported that static strength of the elbow flexors continued to gain four 
weeks after the end of a four weeks’ program of exhaustive ergographic 
exercises. 

In this country there is contradictory evidence concerning the effectiveness 
of the Hettinger and Muller method of strength development. Wolbers and 
Sills (12) reported that an exercise program in which the muscles were held 
in static contraction for six seconds each day over a period of eight weeks 
produced better than chance gains in strength when used with adolescent 
males. On the other hand, Rasch and Morehouse (11) reported that isotonic 
exercises were more effective in developing isometric strength than were static 
exercises. In fact, Rasch and Morehouse reported insignificant gains in 
strength of elbow flexion following a six weeks’ training program which 
employed a single daily 15-second isometric exercise bout at two-thirds 
maximum tension. It should be kept in mind that Rasch and Morehouse 
provided only three training sessions each week, whereas Hettinger and 
Muller used five training days and one testing day each week. Mathews and 
Kruze (10), in comparing the effects of isometric and isotonic exercises on 
elbow flexor strength, concluded that the isometric-type contraction brought 
about greater gains in strength than did the isotonic-type contraction, even 
though the average exercise time given to the former was only a fraction of 
that devoted to the latter. 

Darcus and Salter (2) in England reported gains in strength resulting 
from either isotonic or isometric exercise, although the greater gains resulted 
from use of dynamic exercise. However, the authors pointed out that the 
static training program involved only momentary isometric contractions, and 
thus, in terms of time of application of effort, the training programs were not 
strictly comparable. 

In view of the contradictory findings reported in the literature, there would 
appear to be need for further study of the effectiveness of brief periods of 
isometric muscular effort in the development of static muscular strength. 


Purpose of the Study 


The purpose of this investigation was two-fold: 1. To test the Hettinger- 
Muller method of developing static muscular strength with post-pubescent 
boys, and 2. To compare the effectiveness of the single six-second bout at 
two-thirds maximum tension with higher levels of isometric tension held 
for progressively longer time periods each day. 


Design of the Study 

The general plan of the study involved the use of two experimental groups 
and one control group upon which periodic observations were made on 
changes in static muscular strength during and subsequent to the training 
program. The two experimental groups were given exercise programs in- 
volving the use of isometric tension which differed both in respect to the 
amount of tension employed and the frequency of the daily exercise bouts. 
The control group engaged in no specialized training program during the 
period of the investigation. 

The program of exercise and all tests were confined to wrist flexion of the 
right hand. All subjects in the experimental and control groups were given 
tests of wrist flexion strength at the end of the second, fourth, and eighth 
week, so that appropriate comparisons could be made among the groups. 
The immediate and more lasting effects of the two training procedures could 
thus be ascertained. 


Procedures 

The Sample: The subjects included 30 post-pubescent males randomly 
drawn from the 11th and 12th grades at Wisconsin High School, University 
of Wisconsin. Maturity status was determined by utilizing Crampton’s criteria 
for pubic hair and only those boys meeting the criterion of post-pubescence 
were retained in the study group. The mean age of this group was 17 years, 
the mean height 70 inches, and the mean weight 157 pounds. 


Figure I. Side view of the equipment used for strength 
testing. Subject sat in standard classroom armchair. 


Equipment: The nature of the experiment required a device which would 
provide an accurate record of the static strength developed in wrist flexion, 
confine the action to the prime movers, and provide the means whereby the 
subject and the tester could observe the developed tension at all times. These 
requirements were met by using a standard classroom armchair upon which 
was assembled an arm support and a device for holding a cable tensiometer 
(see Figures I and II). The vertical restraining board, to which the subject’s 
right arm was secured by straps, immobilized the forearm during all testing 
and training operations. 

Training and Testing Procedures: During all training and testing pro- 
cedures, the subject assumed a comfortable sitting position in the chair with 
the feet solidly placed on the floor, the left arm in the lap, and the right arm 
in position for testing. The ulnar side of the forearm was placed upon the 
arm of the chair with the medial aspect placed firmly against the restraining 
board. The canvas straps were then fastened snugly around the forearm com- 
pletely immobilizing the part. The handle of the tensiometer assembly was 
held in the subject’s hand so that the grasping surface was at the level of 
the proximal row of phalanges. The proper adjustments were made in the 
cable assembly so that the wrist remained in a position of hyperextension 
when maximum force was exerted. To insure standardization during both 
training and testing procedures, the chain length for each subject was re- 
corded and the setting for a particular subject was kept constant throughout 
the experiment. 


Figure II. Overhead view of testing equipment showing 
tensiometer and the device to immobilize forearm. 


Test Reliabilities: Reliability coefficients were computed for the test results 
at four points in the experiment, namely, the Friday preceding the experi- 
mental period and the Fridays of the second, fourth, and eighth weeks. In 
computing the reliability coefficients at each testing period, the best score 
was correlated with the average of the other two (see Table 1). 
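
The reliability procedure described above (best trial correlated with the average of the other two) can be sketched as follows. This is only an illustration: the trial scores are hypothetical, the helper names `pearson_r` and `best_vs_rest` are not from the study, and the use of a product-moment correlation is an assumption, since the text does not name the coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_vs_rest(trials):
    """Split three trials into (best score, mean of the other two)."""
    ordered = sorted(trials)
    return ordered[2], (ordered[0] + ordered[1]) / 2


# Hypothetical trial scores (lb.) for five subjects; NOT data from the study.
subjects = [
    (92.0, 95.5, 94.0),
    (101.0, 99.5, 103.0),
    (88.0, 90.0, 87.5),
    (110.0, 108.5, 111.5),
    (97.0, 96.0, 99.0),
]

pairs = [best_vs_rest(t) for t in subjects]
best = [p[0] for p in pairs]
rest = [p[1] for p in pairs]
reliability = pearson_r(best, rest)
print(round(reliability, 3))
```

With real data, one coefficient of this kind would be computed separately for each of the four testing periods.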


TABLE 1 
Coefficients of Reliability of Test of Wrist Flexor Strength at Each Testing Period 

Test Period         N        r 

Initial Test                .950 
Second Week                 .944 
Fourth Week                 .973 
Eighth Week                 .957 
Equating Procedures: The three groups were equated on the basis of the 
initial strength scores secured on the Friday prior to initiating the experimen- 
tal procedures. The best of three trials was taken as each subject’s strength 
score and was used as the basis for matching subjects into three equated 
groups. 

The mean heights, weights, chronological ages, and initial wrist flexor 
strength scores of the equated groups are shown in Table 2. 


TABLE 2 

Mean Heights, Weights, Chronological Ages, and Initial Wrist Strength Scores of the 
Three Groups of Post-Pubescent Boys 

                    Strength 
                    Scores     C. A.     Height    Weight 
Group        N      (lb.)      (yrs.)    (in.)     (lb.) 

E1          10      94.8       16.8      71        160 
E2          10      93.7       17.1      69        150 
Control     10      93.6       17.2      70.5      161 


The exercise program adopted by experimental group E1 followed the 
procedures used by Hettinger and Muller (6) in which two-thirds maximum 
tension was held by each subject for only six seconds once each day, Mondays 
through Thursdays. Each subject in experimental group E2 held 80 
per cent of maximum tension for five periods of six seconds each on Mondays, 
increasing the number of exercise bouts by one each day to a maximum 
of eight on Thursdays. A ten-second rest period was given between 
exercise bouts. Fridays were devoted to testing to determine the tension levels 
which the experimental groups were to employ the following week. All training 
and testing operations were conducted between the hours of 1:00 p.m. 
and 3:00 p.m. Saturdays and Sundays were free of any activity directly related 
to the study. 
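
The two weekly training schedules described above can be laid out in a short sketch that totals the time each group spent under tension. The names `weekly_plan` and `target_tension` are illustrative only, and the 94.8 lb. figure is used merely as an example of a Friday test score from which the next week's target tension would be set.

```python
BOUT_SECONDS = 6    # length of every isometric bout, per the text
REST_SECONDS = 10   # rest between successive bouts, per the text
DAYS = ["Mon", "Tue", "Wed", "Thu"]  # Fridays were reserved for testing


def target_tension(max_strength_lb, fraction):
    """Tension to hold, as a fraction of the latest maximum test score."""
    return max_strength_lb * fraction


def weekly_plan(bouts_per_day):
    """Per-day bout count, seconds under tension, and seconds of rest."""
    plan = {}
    for day, bouts in zip(DAYS, bouts_per_day):
        plan[day] = {
            "bouts": bouts,
            "work_s": bouts * BOUT_SECONDS,
            "rest_s": max(bouts - 1, 0) * REST_SECONDS,
        }
    return plan


# Single daily bout at two-thirds maximum tension.
single_bout_plan = weekly_plan([1, 1, 1, 1])
# Five bouts on Monday, rising by one per day to eight on Thursday, at 80%.
repeated_bout_plan = weekly_plan([5, 6, 7, 8])

single_total = sum(d["work_s"] for d in single_bout_plan.values())
repeated_total = sum(d["work_s"] for d in repeated_bout_plan.values())

print("single-bout group:", single_total, "s per week")
print("repeated-bout group:", repeated_total, "s per week")
print("example targets (lb.):",
      round(target_tension(94.8, 2 / 3), 1),
      round(target_tension(94.8, 0.80), 1))
```

Under these schedule assumptions the repeated-bout program involves several times as much weekly time under tension as the single-bout program, which is the contrast the study was designed to examine.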


TABLE 3 

Mean Strength Scores and Mean Percentage Increases in Strength for the 
Four Testing Periods 

                                                   Percentage Gain or Loss 
                     Week     Week     Post-    Pre-test   Pre-test   Week Four     Pre-test 
          Pre-test   Two      Four     test     to Week    to Week    to            to 
Group     Means      Means    Means    Means    Two        Four       Post-test     Post-test 

E1        94.80      105.35   107.63   96.37    11         14         (illegible)   2 
E2        93.68      101.30   109.22   102.40   8          16         -6            8 
C         93.56      95.18    90.82    88.66    2          -3         -2            -5 

E1  Group using daily single six-second bout at 2/3 maximum tension. 
E2  Group using daily repeated six-second bouts at 80% maximum tension. 
C   Control Group. 
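
The percentage columns of Table 3 can be approximately re-derived from the group means. The sketch below assumes each percentage is the change in the group mean relative to the earlier mean; the original authors may instead have averaged individual percentages, so small discrepancies from the printed figures are to be expected. The group labels in the dictionary are descriptive names chosen here.

```python
# Group means from Table 3: pre-test, week two, week four, post-test.
MEANS = {
    "single bout (2/3 max)": [94.80, 105.35, 107.63, 96.37],
    "repeated bouts (80% max)": [93.68, 101.30, 109.22, 102.40],
    "control": [93.56, 95.18, 90.82, 88.66],
}


def pct_change(start, end):
    """Percentage gain (positive) or loss (negative) from start to end."""
    return (end - start) / start * 100.0


for group, (pre, wk2, wk4, post) in MEANS.items():
    row = (
        pct_change(pre, wk2),    # pre-test to week two
        pct_change(pre, wk4),    # pre-test to week four
        pct_change(wk4, post),   # week four to post-test
        pct_change(pre, post),   # pre-test to post-test
    )
    print(group, [round(v, 1) for v in row])
```

Computed this way, the single-bout group's loss from week four to the post-test comes to roughly 10 per cent of its week-four mean.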


Findings 

In presenting the data in summary form, the mean raw scores of wrist 
flexion strength and the percentage change for each testing period are shown 
in Table 3. It should be noted that Group E2, trained at 80 per cent maximum 
tension with progressively greater numbers of daily exercise bouts, 
showed somewhat greater absolute and relative gains than were achieved by 
Group E1, which utilized the single six-second bout. Furthermore, there 
appeared to be greater strength retention with Group E2 than occurred with 
the group which employed less in the way of intensity and frequency of 
static muscular effort. The rapid loss in strength of Group E1 supports 
Muller’s findings that strength loss after termination of an exercise program 
occurs at about the same rate at which strength is built. 

To determine whether the observed changes in strength within groups 
might be attributed to some factor other than chance, the t test was applied. 
As may be noted in Table 4, the gains in strength from the beginning to 
the end of the training period were significant beyond the .01 level for both 
experimental groups. Gains in strength which might be attributed to normal 
growth or to chance factors can be ruled out, since no significant changes 
occurred in the control group. 


TABLE 4 

Significance of the Differences in Wrist Flexor Strength for Each Group from 
Beginning to End of Four Weeks’ Training Period 

                    Mean          S. E. 
Group       N       Difference    M diff.     t 

E1         10       (illegible)   2.99        (illegible) 
E2         10       15.53         4.54        3.42 
C          10       -2.74         2.50        -1.09 





The t test when applied to the losses in strength of each group from the 
termination of the four weeks’ experimental period to the end of the eighth 
week disclosed that the loss in strength for Group E1 was significant at 
the .01 level whereas the loss for Group E2 was significant at only the 
.05 level (see Table 5). This again points up the fact that strength retention 
following the training period seemed to be more closely associated with the 
program which placed the heavier demands on the muscles. 


TABLE 5 
Significance of Differences in Strength Scores for Each Group from Termination of 
Training to the End of the Fourth Post-Training Week 

                    Mean          S. E. 
Group       N       Difference    M diff.     t         P 

E1         10       -11.26        2.51        -4.49     .01 
E2         10       -6.82         3.22        -2.12     .05 
C          10       -2.15         3.20        -.67      .50 
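
The t values in Table 5 follow from dividing each mean difference by its standard error. The short check below uses the figures as printed; the assignment of rows to the single-bout, repeated-bout, and control groups is an assumption based on the order of the group means in Table 3, and the function name `t_ratio` is illustrative.

```python
# (group label, mean difference, S.E. of mean difference, printed t)
ROWS = [
    ("single bout", -11.26, 2.51, -4.49),
    ("repeated bouts", -6.82, 3.22, -2.12),
    ("control", -2.15, 3.20, -0.67),
]


def t_ratio(mean_diff, se_mean_diff):
    """t for a mean difference: the difference divided by its standard error."""
    return mean_diff / se_mean_diff


for label, mean_diff, se, printed_t in ROWS:
    computed = t_ratio(mean_diff, se)
    print(f"{label}: computed t = {computed:.2f}, printed t = {printed_t:.2f}")
```

With ten subjects in each group, a within-group comparison of this kind has nine degrees of freedom.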





In order to determine whether the observed differences in strength among 
the three groups at the end of the four weeks’ experimental period might be 
attributed solely to chance, an analysis of variance was used (see Table 6). 
The technique employed was that recommended by Edwards (5) for use 
with paired or matched subjects. The analysis showed that the difference 
among the groups could not be attributed to chance (P = .01). 


TABLE 6 
Summary of Analysis of Variance of the Strength Scores of the Three Groups 
at the End of the Fourth Week of Training 

Source of             Sum of                  Mean 
Variation             Squares        df       Squares 

Between columns        2,079.71       2       1,039.85 
Between rows           5,559.15       9         617.68 
Residual (error)       2,637.94      18         146.55 

Total                 10,276.80      29 
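
The F ratio implied by Table 6 is not printed, but it can be recovered from the sums of squares. The sketch below assumes the usual matched-groups layout (three treatment columns, ten rows of matched subjects) and tests the column effect against the residual mean square; the function name `anova_summary` is illustrative, and the critical values are the standard tabled values of F for 2 and 18 degrees of freedom.

```python
def anova_summary(ss_columns, ss_rows, ss_error, n_groups, n_blocks):
    """Mean squares and the column F ratio for a matched-groups design."""
    df_columns = n_groups - 1
    df_rows = n_blocks - 1
    df_error = df_columns * df_rows
    ms_columns = ss_columns / df_columns
    ms_rows = ss_rows / df_rows
    ms_error = ss_error / df_error
    return {
        "df": (df_columns, df_rows, df_error),
        "ms_columns": ms_columns,
        "ms_rows": ms_rows,
        "ms_error": ms_error,
        "F_columns": ms_columns / ms_error,
    }


# Standard tabled critical values of F for df = (2, 18).
F_CRIT_05 = 3.55
F_CRIT_01 = 6.01

summary = anova_summary(2079.71, 5559.15, 2637.94, n_groups=3, n_blocks=10)
f_value = summary["F_columns"]
print("F =", round(f_value, 2))
print("beyond .05 level:", f_value > F_CRIT_05)
print("beyond .01 level:", f_value > F_CRIT_01)
```

The same function can be applied to any other summary table of this form by substituting its sums of squares.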

















To determine the source of variation at the end of the experimental period, 
the t test was employed. As may be noted in Table 7, the differences between 
Group E1 and the control group and between Group E2 and the control group 
were significant at the .02 and .01 levels, respectively. The difference between 
the strength scores of the two experimental groups was not significant. These 
findings indicate that both the single six-second daily exercise bout using 
two-thirds maximum tension and the method employing more frequent exer- 
cise bouts at 80 per cent maximum tension are effective methods of building 
wrist flexion strength in adolescent males. However, on the basis of the 
data herein included, the one method cannot be judged to be superior to the 
other in terms of immediate strength building results. 


TABLE 7 
Significance of Differences in Strength Scores Between Groups at the End of the 
Four Weeks’ Training Period 

              Mean       S. E. 
Group         Diff.      M Diff.     t         P 

C - E1        16.81      5.31        3.165     .02 
C - E2        18.40      4.99        3.687     .01 
E1 - E2       1.58       5.90        .267      .80 





In order to determine whether true differences existed in strength among 
the groups four weeks after termination of the exercise programs an analysis 
of variance was again employed utilizing the final test scores. As will be 
noted in Table 8, the difference in strength among the groups was significant 
at the .05 level. 

Although the results obtained from the analysis of variance made further 
statistical treatment questionable, it was decided to apply t tests to deter- 
mine the source of variation among the groups at the time of final testing 
(see Table 9). It will be noted that the difference between Group E2 and 
the control group was significant at the .02 level, whereas the difference 
TABLE 8 

Summary of Analysis of Variance of the Strength Scores Four Weeks Following the 
Termination of Training 

Source of             Sum of                  Mean 
Variation             Squares        df       Squares 

Between columns          947.96       2         473.98 
Between rows           5,110.69       9         567.85 
Residual (error)       1,869.95      18         103.88 

Total                  7,928.60      29 




















between Group E1 and the control was not significant (P = .10). The difference 
between the two experimental groups four weeks after the end of the 
training program was not significant (P = .30). In the light of these find- 
ings, it would appear that greater tension exerted more frequently is some- 
what more effective in maintaining strength once it is developed than is the 
single daily six-second bout at two-thirds maximum tension. 


TABLE 9 

Significance of Differences in Strength Scores Between Groups Four Weeks after 
Termination of Training 

              Mean       S. E. 
Group         Diff.      M diff. 

C - E1        7.70       4.33 
C - E2        13.74      4.45 
E1 - E2       6.02       4.88 








Summary and Conclusions 


This study was designed to compare the relative effectiveness of single 
daily isometric exercise bouts maintained at two-thirds maximum tension 
with a program of static exercise in which the frequency of the six-second 
bouts was progressively increased with tension levels at 80 per cent of 
maximum static strength. Thirty post-pubescent boys, divided into two 
experimental groups and one control group, served as subjects. Within-group 
and between-group comparisons of strength scores were made at the 
conclusion of the four weeks’ period of training and again four weeks after 
the termination of the exercise program. 

The following summarizes the findings of the investigation: 


1. In terms of raw score and percentage gains both experimental groups 
elicited gains during the experimental period. However, the strength increase 
achieved by the group utilizing 80 per cent maximum tension with progres- 
sively greater numbers of daily exercise bouts was slightly greater at the 
end of the training period and the decline less during the post training 
period than for the group employing the daily six-second method. 
2. The gains achieved by the two experimental groups at the end of the 
training period were significant beyond the .01 level. Loss in strength during 
the four weeks’ post training period was significant for both groups at or 
beyond the .05 level. 

3. In comparing differences among groups, the two experimental groups 
showed significantly higher strength scores than the control group at the end 
of the four weeks’ training period (P = .01 and P = .02). The difference 
between the strength scores of the two experimental groups was not significant. 

4. Differences among the groups four weeks after terminating the special 
exercise programs were less dramatic, although the strength of the group 
using 80 per cent maximum tension was still significantly superior to the 
control group (P = .02). However, there was no significant difference 
between the group employing two-thirds maximum tension and the control 
group. The difference between the two experimental groups in strength 
retention, while not significant (P = .30), favored the group employing 
the higher tension level for longer periods of time. 

The findings of this study generally supported the Hettinger-Muller 
hypothesis of static strength development. While the data indicated that 
tension levels greater than two-thirds maximum with more frequent exercise 
bouts were not superior to the single daily six-second bout in building iso- 
metric strength, the former method tended to be slightly more effective in 
terms of developing qualities of strength retention. 
Development and Validation of 
an Objective Measure of Locomotor 
Response to Auditory Rhythmic Stimuli¹ 


SHIRLEY E. SIMPSON² 
Boston Public Schools 
Boston, Massachusetts 


Abstract 


This study indicates that an objective measure was developed and validated which 
permits measurement of locomotor response to auditory rhythmic stimuli. This instru- 
ment is called a “Rhythmeter.” Results of the study indicate that when women from 
the general college population were compared with trained amateur and professional 
dancers, the scores achieved by the dancers were statistically superior. A comparison 
of scores received on the “Rhythmeter” with those made on a written sensory test indi- 
cates a very low correlation between these factors. It was found that sensory and motor 
responses are not similar either within individuals, or among groups. 


THIS STUDY was selected as a result of a need which has long been felt in 
dance and in physical education. Tests of rhythm have been made over the 
years, but none to date have objectively measured locomotor response to 
rhythmic stimuli. As dance is a locomotor activity, it is essential that the 
measurement of the element of rhythm be locomotor. 

A review of the literature indicated that rhythm is an individual, kines- 
thetic, instinctive experience. Rhythm is kinesthetic in that it must be felt 
by the individual, and it is instinctive in that all individuals are able to 
feel and respond (but not always accurately) to rhythmic stimulation. Sea- 
shore (10), Ruckmick (9), Flagg (2), Hayes (5), Redfield (8), and others, 
expressed this belief and projected the feeling that individual response to 
rhythmic stimulus is needed. 

Certain studies in the field of rhythm have been made. These can be classi- 
fied under several categories such as motor tests, sensory tests, and certain 
studies made which involve rhythm or rhythmic response. 

Probably the most pertinent segment of literature is that in which the need 
for further study is emphasized. The writers indicated a dearth of evi- 
dence on objective measurement of locomotor response to rhythm and recom- 
mended that much further study and investigation be done in these areas. 


¹ This study was made in partial fulfillment of requirements for the degree of Doctor 
of Education at Boston University School of Education, 1957, under the direction of 
Dr. Arthur G. Miller. 

² Home address of author is 17 East Milton Rd., Brookline 46, Massachusetts. 


Techniques and Procedures 


The Apparatus. The instrument constructed for this study to measure the 
locomotor response of individuals to auditory rhythmic stimuli was made up 
of several parts and is called a “Rhythmeter” (see Figure I). 


Figure I. The “Rhythmeter,” showing kymograph 
in background. In foreground are the two panels 
which are the point of contact for the subject 
and the starting platform (r.). 


The point of contact for the subject is a pair of panels, placed in parallel 
position, and attached firmly to the rear plane of the instrument. Each is 12 
inches long, six inches wide, and painted red on the front four inches. Under 
the front section of each panel is a two-inch compression-type spring with a 
displacement of 350 pounds per inch. Directly in front of the spring is an 
electric button switch. As the panels are depressed, a contact on the under 
surface connects with the switch, thus closing the electrical circuit. The 
switches are wired in parallel circuit and are connected with an electro- 
magnet which has a pen-holding arm in contact with the kymograph. The 
electricity is supplied from the regular 110-volt alternating house current, 
and a chime transformer mounted on the instrument reduces the voltage to 
10 volts. 

A platform, called the starting box, is so constructed that it fits around 
the panel section for carrying purposes. While in use, the platform is hooked 
onto the base of the instrument for stability. It is on the starting box that 
the individual stands to listen to the instructions, and to which she returns 
following her response to each item. 

The kymograph is an electrically driven rotating drum which operates on 
115 volt, 60 cycle A.C. current. A pen arm attached to the electromagnet 
holds a ballpoint cartridge firmly against the drum by means of a small 
compression spring. 

The paper on which the pen writes is adding-machine tape, fed to the 
kymograph from a spindle. During the course of the presentation, the 
kymograph was positioned behind the subject so that it could not be seen 
by her. 

Pilot Study. A pilot study was conducted using the “Rhythmeter” on high-school 
junior class girls. This was to determine whether or not the instrument 
measured individual performances, and to correlate the scores with the 
factors of height and weight, age, and I. Q. 

The auditory rhythmic stimuli consisted of 46 stimulus-response patterns, 
performed by drum beat and recorded on magnetic tape. 

An analysis of internal consistency indicated that 24 items were statistically 
significant, thus causing the rejection of the remaining 22 items. 

The study indicated that individual differences exist in locomotor response 
to auditory rhythmic stimuli. 

Evidence indicated no statistically significant relationship between “Rhyth- 
meter” score and age, between score and I. Q., or between score and height 
and weight. 

It was felt that some of the patterns required the subject merely to 
memorize or respond intellectually, and thus might not measure true rhythmic 
response. For that reason, continuing rhythm patterns were used in the 
major study. This provided sufficient repetition for the subject to feel the 
pattern as would occur in dancing, and then replicate it by locomotor 
response. 

The Measure. Fifteen rhythmic patterns, found by the pilot study to be 
statistically significant, were selected for use in the measure. The patterns 
included basic walking or running beats, and skips, gallops, rumba and polka 
rhythms. Rhythms of varying note values and tempo in 2/4, 3/4, 4/4 were 
included. The note values of these steps were prepared and formed into 
measures for use as auditory stimuli, and were played by a professional 
pianist and recorded on magnetic tape. 

The first half of the test was comprised of 15 stimulus-response items in 
which the pattern was presented once and which the individual was required 
to repeat. In the second half, the same items were presented in varying 
order as continuing rhythm. In continuing rhythm, the pattern was presented 
and repeated several times. The individual was invited to join the piano 
rhythm when ready. 

The measure was easy to administer, brief, and scored by the use of a 
plastic mask. The subject’s response was considered to be correct if within 
one mm. of the master mask. One correct response was sufficient for each 
item. The range of difficulty was such that no individual of those tested 
made a perfect score, yet none failed to make at least one perfect response. 
A one percent level of confidence was established as the level at which the 
data would be statistically significant. 










TABLE 1

Distribution of Scores and Reliability Coefficients for the Study Groups on
the Instrument to Measure Locomotor Response (Rhythmeter)

Column headings: Group; Range (30 items); S.D.; Hoyt’s Formula (r);
Split-Half Correlation

Control Group
  General college population           8.1
Experimental Group
  Dance Club members                  20.3
  Professional dancers                21.7
  Combined Experimental Group         20.9

[Only one figure per group is legible in this copy.]























TABLE 2

Distribution of Scores on Sensory Test and Coefficient of Correlation Between
Instrument to Measure Locomotor Response (Rhythmeter) and Rhythm Identification
Section of Kwalwasser-Dykema Music Tests

                                Mean Scores     Range of
                                on Rhythm       Scores
Group                           Ident.          (25 items)    S.D.     r

Control Group
  General college population    17.74           12-24         18       .081
Experimental Group
  Dance Club members            21.9            15-23         1.12     .56
  Professional dancers          21.54           15-23         1.79     .59
  Combined                      21.31           15-23         1.51     .56




















Commercial Sensory Tests Incorporated in Study. Objective measurement of
sensory response to rhythm has been possible for many years. As the
relationship between sensory and locomotor response had not been
established, the time discrimination and rhythm identification sections of the
Kwalwasser-Dykema Music Tests (7) were administered. No evidence was
found that the ability to score successfully on a sensory test indicated that
the subject would perform as well on a locomotor response measure.

Populations Studied. Two experimental groups and one control group were used
as subjects for the study. The control group consisted of 89 members of the
general college population. The experimental groups included 38 professional
dancers and 42 members of the dance clubs in professional schools or colleges
of physical education. All of the subjects were women.

In order to determine a relationship between sensory and locomotor re- 
sponse, the rhythmic identification and time discrimination sections of the 
Kwalwasser-Dykema Music Tests (7) were administered. These paper and 
pencil tests, of 25 items each, required the subject to write her response to 
the rhythmic patterns presented by record. Scoring was by means of a 
matrix. 
TABLE 3

Distribution of Scores on Sensory Test and Coefficient of Correlation Between
Instrument to Measure Locomotor Response (Rhythmeter) and Time Discrimination
Section of the Kwalwasser-Dykema Music Tests

                                Mean Scores
                                on Time
Group                           Disc.

Control Group
  General college population    15.17
Experimental Group
  Dance Club members            16.58
  Professional dancers          14.64
  Combined                      15.36




















TABLE 4

Significance of Differences Between Means of Study Groups on Performance on
Instrument to Measure Locomotor Response (Rhythmeter)

Groups                                                  Level of      Hypothesis
Compared                      Mean    S.D.              Confidence    Tested

General college students       8.1    6.5     12.6      P 1%          Reject
Dance Club                    20.3    4.5

General college students       8.1    6.5               P 1%          Reject
Professional dancers          21.7    4.4

Dance Club                    20.3    4.5               [illegible]   [illegible]
Professional dancers          21.7    4.4

Control Group                  8.1    6.5               P 1%          Reject
Combined Experimental Group   20.9    4.4























C = Control Group 
E = Experimental Group 


Statistical Procedures. Comparisons of the scores of the three groups were
made. Hoyt’s Formula (6), using the analysis of variance technique, was
used to determine reliability. Further evidence of reliability was established
by determining the coefficient of correlation between the two halves of the
test. Validity was determined by testing the significance of the differences
among the means of the study groups. The null hypothesis held that no
statistically significant differences exist among the means of the study groups.
Face validity exists in that the auditory rhythmic stimuli are similar to those
used in dance, actual locomotor response is required, and the response is
automatically recorded.
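
The correlation between the two halves of the test can be sketched as follows. This is a minimal illustration only: the item rows are hypothetical, the helper names `pearson_r` and `half_scores` are invented here, and splitting each row at its midpoint is an assumption based on the description of the test as two halves.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def half_scores(item_rows):
    """Return each subject's total on the first half and on the second half."""
    first, second = [], []
    for row in item_rows:
        mid = len(row) // 2
        first.append(sum(row[:mid]))
        second.append(sum(row[mid:]))
    return first, second


# Hypothetical data: 1 = correct response, 0 = incorrect; one row per subject.
item_rows = [
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1],
]
first_half, second_half = half_scores(item_rows)
print(round(pearson_r(first_half, second_half), 3))
```

The same `pearson_r` helper would serve for comparing scores on two different tests, since only the paired score lists change.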

The Pearson product-moment correlation coefficient was used to compare 
the scores achieved on the “Rhythmeter” with those of the rhythmic identifi- 
cation and time discrimination sections of the Kwalwasser-Dykema Music 
Tests (7). 

Internal consistency was determined by the Phi coefficient method. 
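
A phi coefficient for a single item can be sketched from a 2 x 2 table. The cell layout below (high and low scorers against correct and incorrect responses) is an assumption about how the item analysis might be arranged, and the counts are hypothetical.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table laid out as:

                 correct   incorrect
    high group      a          b
    low group       c          d
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("a row or column total is zero; phi is undefined")
    return (a * d - b * c) / denominator


# Hypothetical counts for one item.
print(round(phi_coefficient(8, 2, 3, 7), 3))
```

An item that separates the high and low groups perfectly gives a phi of 1, and an item answered alike by both groups gives 0.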







Results 


The effectiveness of the instrument to measure locomotor response (Rhyth- 
meter) was determined statistically. Reliability as determined by Hoyt’s 
Formula (6) involving the analysis of variance was significant at the 1 per 
cent level. Correlation between stimulus-response and continuing rhythm 
items was found to be very high, with a Pearson r of .89. 

The differences in means of control and experimental groups are significant
at below the one percent level of confidence. The differences in means within
the experimental group are not statistically significant.
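
A test of the difference between two group means can be sketched from summary figures alone. The sketch uses the control group and Dance Club means, standard deviations, and group sizes reported in this paper. The unpooled standard-error formula is an assumption, since the paper does not state which formula was used, and the function name is hypothetical.

```python
from math import sqrt


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Ratio of the mean difference (b minus a) to its unpooled standard error."""
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


# (mean, S.D., N) as reported for the two groups.
control = (8.1, 6.5, 89)
dance_club = (20.3, 4.5, 42)

t_value = t_from_summary(*control, *dance_club)
print(round(t_value, 2))
```

Under this assumption the ratio comes out a little below 12.5, far beyond the 2.58 usually required at the 1 per cent level for large samples.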

The test of significance for the difference between the means of the sensory 
tests was negative. No statistically significant differences existed among the 
populations studied on sensory test performance. 

The coefficients of correlation between scores obtained by participants on
the “Rhythmeter” and the sensory tests were found to be low.

The Phi coefficient method was used to determine internal consistency.
All but one item were found to be statistically significant at the 1 per cent
level of confidence.


Conclusions 


An instrument to measure objectively differences in locomotor response 
to auditory rhythmic stimuli has been developed and found to be reliable. 


All subjects for the study were women. 

Within the experimental study group, no significant differences were found 
on the “Rhythmeter.” 

Scores made by control and experimental groups indicated that individual 
and group differences existed in locomotor response to auditory rhythmic 
stimuli, to a statistically significant degree. 

Although the experimental groups had a higher coefficient of correlation 
than the control group, no statistical evidence was found to indicate signifi- 
cant differences between the means of control and experimental groups for 
the Kwalwasser-Dykema Music Tests (7). 

No significant relationship exists between the “Rhythmeter” and the sensory 
tests within the total study group. 

Results of this study indicate that performance on the “Rhythmeter” may 
serve as an indication of rhythmic ability, which is a basic need in the field 
of the dance. 


Recommendations for Further Study 


1. Norms should be established for the measurement of locomotor response
to auditory rhythmic stimuli.

2. A comparison study should be made of differences in results between
males and females.

3. Scores made on the “Rhythmeter” should be compared with athletic
skill.

4. A study should be made of locomotor response to other stimuli, such as
light or touch.

5. The response to auditory rhythmic stimuli by locomotor activity
(Rhythmeter) and by motor (hand) reaction should be compared.

6. A study should be made of group results on the “Rhythmeter” for
an experimental group given concentrated practice in rhythms and
dance, as compared with a non-practicing control group.

7. A growth study over an extended span of time should be made to
determine the relationship between the scores of individuals at one age and
the same subjects’ scores when older.
Relationships of Extreme Body
Types to Ranges of Flexibility¹


HERMAN J. TYRANCE 
Washington, D. C.


Abstract 


Data were collected from 105 of the fattest, thinnest, and most muscular students at
Pennsylvania State University to investigate the relationship of these extreme body
types to ranges of flexibility. Anatomical and statistical procedures were used to discover
the association. X-rays failed to indicate any significant cause for extreme ranges of
flexibility. Statistical analysis demonstrated concurrence among important variables to a
significant degree.


BODY SIZE is determined by the measurement of such characteristics as 
height, weight, muscle development, adipose tissue, and skeletal or bony 
structure. If the body size of an individual is out of proportion or extreme, 
he may be handicapped in many ways. The restriction of motor performance 
may be due to the varying degrees of body size or to the physical structures 
which have a direct bearing on his motor ability. Motor ability performance 
depends upon many interrelated factors, among which is joint mobility or 
flexibility. Variations in the position of muscular attachments, the type and 
quality of the muscles themselves, the length of the bony levers, and the 
structures at the joint (ligaments, cartilage, and tendons)—all influence this 
range of motion. 

Many studies have indicated the relationship of body size to physical and 
mental achievement. Some research has been done which illustrates the rela- 
tionship of flexibility to motor performance. This study of the relationship 
of extreme body types to ranges of flexibility was undertaken at Pennsylvania 
State University in 1953 to determine whether some predictions could be 
made about flexibility in terms of known body size. 


Procedure 


Thirty-five each of the thinnest, fattest, and most muscular students at
Pennsylvania State University were selected for the study. All the subjects were
Caucasian and between the ages of 18 and 22. The students were judged in
terms of degree of occurrence of characteristics for fatness (endomorphy),
muscularity (mesomorphy), and thinness (ectomorphy).


1 This study was made in partial fulfillment of the requirements for the degree of 
Doctor of Philosophy at Pennsylvania State University, University Park, Pennsylvania. 







The Research Quarterly, Vol. 29, No. 3 














[Figure I. Points of photographic measurement on the three poses: Front Pose, Lateral Pose, Rear Pose]


Each subject was weighed and measured in underwear and socks. After
the age, height, weight, class, and the type of physical activity engaged in
by each student were recorded, he was instructed to stand upon a revolving
platform. The platform was rotated so that front, side, and rear views could
be taken. A black and white grid with horizontal diameters and perpendicular
planes was placed behind the platform to be used later as a guide for
photogrammetric purposes. Each student was clad in an athletic supporter.

Some of the phases of the techniques of Sheldon (5) and Dupertuis and
Tanner (1) were used. The relaxed pose, except for the stiffly extended arms,
was resorted to because correction of the individual’s postural pattern tended
to produce awkwardness. Figure I shows the points of photographic measurement
found on the three poses of the students. On the front pose, three
measurements were taken: 1. Facial-Breadth-one (FB1), the lateral measurement
of the face taken at the highest level of the junction of the pinna of the
ear with the skin line of the head; 2. Facial-Breadth-two (FB2), the lateral
measurement taken at the lowest level of the junction of the lobe of the ear
with the skin line; and 3. Neck-Thickness-Transverse (NTT), the shortest
lateral diameter of the neck.

The second pose, showing the depth of the body from the left side view,
exhibits nine measurements: 1. Neck-Thickness-antero-posterior (NTa-p), the
shortest front-to-back diameter of the neck; 2. Trunk-Thickness-one (TT1),
the horizontal front-to-back diameter of the trunk taken at a point midway
between the level of the center of the nipple and the most anteriorly
projecting point of the sternoclavicular junction; 3. Trunk-Thickness-two (TT2),
the minimum horizontal front-to-back diameter taken at the level of the
waistline; 4. Trunk-Thickness-three (TT3), the horizontal front-to-back
diameter taken at the level of a point on the body surface directly over the
symphysis pubis; 5. Arm-Thickness-Upper (ATU), the front-to-back arm
diameter taken at the level of the midpoint between the photographic center
of the cubital fossa and a point on the skin overlying the greater tuberosity
of the humerus and lying immediately beneath the anterior tip of the
acromion process; 6. Arm-Thickness-Lower-one (ATL1), the forearm diameter
taken at the level of greatest thickness below the elbow in a plane
perpendicular to the axis of the forearm; 7. Arm-Thickness-Lower-two (ATL2),
the photographic diameter taken in a plane perpendicular to the axis of the
forearm at a predesignated distance above the styloid process of the radius;
8. Leg-Thickness-Upper-one (LTU1), the front-to-back diameter of the leg
taken at the level of the center of the angle formed by the subgluteal fold;
and 9. Leg-Thickness-Upper-two (LTU2), the horizontal front-to-back diameter
of the leg taken at the level of the center of the slight fossa or hollow
seen immediately above the patella.

The five breadth measurements taken from the rear pose are: 1.
Trunk-Breadth-one (TB1), the lateral diameter between the uppermost visible points
in the lines formed by the posterior axillary folds; 2. Trunk-Breadth-two
(TB2), the minimum transverse diameter taken at the narrowest level of the
waist; 3. Trunk-Breadth-three (TB3), the maximum horizontal transverse
diameter taken at the widest level of the hips; 4. Leg-Thickness-Lower-one
(LTL1), the maximum transverse calf diameter taken at the level of the
greatest thickness of the calf of the gastrocnemius muscle in a plane
perpendicular to the axis of the lower left leg; and 5. Leg-Thickness-Lower-two
(LTL2), the minimum transverse left ankle diameter taken at the narrowest
point in the ankle, not necessarily in a plane at right angles to the axis of
the leg.

Panatomic-X film was used with the exposure time set for 1/25 second and 
the lens stopped down to f4.5. The lighting features were checked by instruc- 
tors in the Visual Aids Film Program of Pennsylvania State University. The 
printing and developing of the film were also under their direction. Measure- 
ments were taken from the pictures and transformed into somatotypes, using 
the tables of Sheldon (5). These somatotypes were similar in arrangement 
of components and through anthroposcopy the dissimilar ones were elimi- 
nated. The ones having the highest rank were judged to be the true somato- 
types. The estimates and diameter measurements were taken on each body 
region. The mean type of the five body regions for each subject was judged 
to be his somatotype. 

Each student remained in his shorts and socks for all the flexibility
measurements except the ankle inversion-eversion measurement, for which shoes
were required. The measuring instrument, the flexometer, is a metal disc
graded in a double row of degrees. Both rows have a common zero mark
with bottom and top rows running left and right to 360°. Superimposed on
the face is the indicator with a weighted balance at one end and a needle
at the other. When the instrument had been locked to a joint by a strap,
extreme movement was made in one direction and a lock designated this
point as zero. Extreme movement was then made at the joint in the opposite
direction, and at this point the full amplitude of angular movement
at the joint was recorded in degrees. There were 19 measurements taken of eight
joints. None were taken of finger and toe flexibility. The positions in which
the measurements were made are the following:


1. Chair sitting position. Neck lateral flexion-extension, wrist vertical and wrist
lateral flexion, and wrist rotation measurements were taken of each subject seated in
an armchair.

2. Table position. Ankle flexion-extension, ankle inversion-eversion, and hip rotation 
measurements were taken of the subjects seated on the table. Elbow flexion-extension 
measurements were taken from a squat position, with the arm diagonally placed across 
the corner of the table. Trunk rotation, neck rotation, and neck flexion-extension 
measurements were taken from a supine position on the table. The only prone position 
assumed on the table was for the measurement of knee flexion-extension. 

3. Standing position. This position was assumed to determine amplitude of hip 
flexion-extension, trunk lateral flexion-extension, shoulder flexion-extension, shoulder 
abduction-adduction, and shoulder rotation. All measurements were directed and 
approved by the inventor of the instrument, Dr. Jack Leighton (3), who at that time 
was a member of the faculty at Pennsylvania State University. The objectivity of the 
flexometer had already been established at between .89 and .99. 


When the flexibility range of a subject was found to be 20° or more
above or below the mean range of the entire group for that joint movement,
X-rays were taken to determine, if possible, the anatomical cause of the
excessive or restricted range. All subjects exhibiting these extreme ranges
of flexibility and not having a record of previous injury were X-rayed at the
Pennsylvania State University Health Service. An official diagnosis was
issued by the attending medical officer qualified to read and interpret the
X-rays.
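
The 20° screening rule can be sketched as a simple filter. The subject labels and readings below are hypothetical, and the function name is invented for illustration; only the threshold comes from the text.

```python
from statistics import mean


def flag_extreme_ranges(ranges_by_subject, threshold=20.0):
    """Return deviations from the group mean that reach the threshold either way."""
    group_mean = mean(ranges_by_subject.values())
    return {
        subject: round(value - group_mean, 2)
        for subject, value in ranges_by_subject.items()
        if abs(value - group_mean) >= threshold
    }


# Hypothetical flexometer readings (degrees) for one joint movement.
neck_flexion = {"S01": 112.0, "S02": 135.0, "S03": 160.0, "S04": 129.0, "S05": 104.0}
print(flag_extreme_ranges(neck_flexion))
```

In this made-up group the mean is 128°, so only the subjects at 160° and 104° would be referred for X-ray.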


Statistical Procedure 


The following techniques were used to determine the significance of the 
differences in ranges of flexibility and to investigate the relationship of the 
body types to ranges of joint mobility: 

1. The use of t values to determine the significance of the difference 
between the group means of the three body types. 

2. The use of zero-order correlations to determine the association of 
flexibility variables. 

3. The use of zero-order correlations to determine the association of 
somatotype variables. 

4. The use of zero-order correlations to determine the association of 
somatotype variables with flexibility variables. 

5. The use of multiple correlations to show the predictive values of 
flexibility variables when associated with somatotype criteria. 

6. The use of chi-square and the contingency coefficients to determine the 
possible extent of influence of similar flexibility traits in different body types. 

The data consisted of scores made by students on the range of movement
at the eight joints of the body and the measurements which identify body
types. Once these differences were established and their statistical significance
found, the data were inspected to discover how the flexibility measures and the
somatotype variables were intercorrelated.

The flexibility variables which exhibited the various predictive values when 
associated with somatotype criteria were analyzed to determine what degree 
of chance was present in the selection of the sample. Effort was made to find 
the anatomical cause or causes which could account for the range of flexi- 
bility among the extreme body types. 


Findings 
Thirty-one flexibility variables of each of the extreme body types were 
compared to discover what differences existed. Statistically the differences 


TABLE 1
t Values for Significance of the Difference Between the Group Means
of Extreme Body Types
(in degrees of flexibility)

Flexibility             Endomorphic Group    Mesomorphic Group    Mean          t
Variables (N = 70)      Mean       SD        Mean       SD        Difference    Values*

Neck flexion            112.43    12.14      135.63    17.71      23.20
Neck lateral flexion    105.14    12.23      115.74    14.78       8.60
Neck rotation           142.63    16.71      159.57    14.85      16.94
Elbow flexion           144.07     8.13      145.20     7.78       1.13
Knee flexion            132.20     9.11      136.92     6.22       4.72
Hip abduction            45.31     6.56       49.09     6.16       3.78
Hip extension           106.14    13.66      117.29    16.91      11.17

Flexibility             Ectomorphic Group    Mesomorphic Group    Mean          t
Variables               Mean       SD        Mean       SD        Difference    Values*

Neck flexion            139.80    12.26      135.63    17.71       4.17          1.10
Neck lateral flexion    116.94    15.93      115.74    14.78       1.20           .324
Neck rotation           155.62    14.94      159.57    14.85      -3.95         -1.09
Elbow flexion           148.27     6.59      145.20     7.78       3.07          1.77
Knee flexion            141.56     7.11      136.92     6.22       4.64          2.78
Hip abduction            49.70     6.60       49.09     6.16        .61           .396
Hip extension           113.20    15.73      117.29    16.91      -4.09         -1.52

Negative value indicates difference in favor of the Mesomorphic Group

Flexibility             Endomorphic Group    Ectomorphic Group    Mean          t
Variables               Mean       SD        Mean       SD        Difference    Values*

Neck flexion            112.43    12.14      139.80    12.26      27.47          9.27
Neck lateral flexion    106.14    12.23      116.94    15.93      10.80          4.42
Neck rotation           142.63    16.71      155.62    14.94      12.99          3.39
Elbow flexion           144.07     8.13      148.27     6.59       4.20          2.30
Knee flexion            132.20     9.11      141.56     7.61       9.36          4.60
Hip abduction            45.31     6.56       49.70     6.60       4.31          2.69
Hip extension           106.14    13.66      116.06    15.73       9.92          3.31

* A t of 2.58 is significant at the 1% level and of 1.96 at the 5% level of confidence.
TABLE 2
Zero-Order Intercorrelations of Flexibility Variables

Variables (N = 105)      NF       NLF      NR       EF       HE

Neck flexion                      .6434    .5658    .1197    .4015
Neck lateral flexion                       .4585    .1917    .2031
Neck rotation                                       .1407    .2963
Elbow flexion                                                .0406
Hip extension
Hip abduction
Knee flexion


























TABLE 3
Zero-Order Intercorrelations of Somatotype Variables

Variables                                        Ecto-      Meso-      Endo-
(N = 105)                   Height    Weight     morphy     morphy     morphy

Height                                .2369      -.0653     .1320      .1018
Weight                                           -.8419     .0309      .7070
Ectomorphy                                                  -.4183     -.5559
Mesomorphy                                                             -.4706
Endomorphy
Neck-Thickness-
  antero-posterior
Trunk-Breadth-one





























between the ranges of flexibility of the three body types were significant when 
the means were compared. The t values showed that significant differences 
of the means were present when elbow, wrist, and knee movements of thin 
types were compared with muscular types. These differences were significant 
at the 5 per cent level of confidence. Both ectomorphic and mesomorphic 
types showed significant mean differences in neck flexion, neck lateral flexion, 
neck rotation, elbow flexion, knee flexion, trunk flexion, and hip extension, 
when compared with the endomorphic types. All of these differences were 
either at the 1 or 5 per cent levels of confidence. Table 1 shows the seven 
flexibility variables exhibiting the highest t values. Where right and left 
sides of the body were taken, an average of those joint measurements served 
as the joint score. These seven most significant variables were the only ones 
used in the computations of this study. 

In order to discover the degree of correlation present among the seven 
selected flexibility variables, zero-order correlations were used. Table 2 shows 
the association of these measures. Neck flexion showed the highest correlation 
with other flexibility variables, and hip abduction and elbow flexion showed 
the lowest. 

Since measurements of the two lateral types—the endomorph and the 
mesomorph—would show wide differences when compared with the thin or 
ectomorphic type, greatest attention was given to a comparison between the 
TABLE 6

Distribution of Flexibility Scores According to Somatotype Showing Observed
and Theoretical Frequencies

Level        Mesomorphy      Ectomorphy      Endomorphy      Totals2

Somatotype Scores in Relation to Neck Flexion Levels1
High          1 (  .66)       1 (  .66)       0 (  .66)        2
Medium       29 (23   )      32 (23   )       8 (23   )       69
Low           5 (11.3 )       2 (11.3 )      27 (11.3 )       34

Somatotype Scores in Relation to Neck Lateral Flexion Levels
High          9 ( 7   )      10 ( 7   )       2 ( 7   )       21
Medium       19 (17.3 )      16 (17.3 )      17 (17.3 )       52
Low           7 (10.66)       9 (10.66)      16 (10.66)       32

Somatotype Scores in Relation to Neck Rotation Levels
High         10 ( 7   )       8 ( 7   )       3 ( 7   )       21
Medium       21 (18.66)      23 (18.66)      12 (18.66)       56
Low           4 ( 9.33)       4 ( 9.33)      20 ( 9.33)       28

Somatotype Scores in Relation to Elbow Flexion Levels
High          5 ( 5.66)       6 ( 5.66)       7 ( 5.66)
Medium       23 (22.66)      27 (22.66)      20 (22.66)
Low           7 ( 5.66)       2 ( 5.66)       8 ( 5.66)

Somatotype Scores in Relation to Hip Extension Levels
High         13 ( 9   )      10 ( 9   )       4 ( 9   )
Medium       16 (18.66)      24 (18.66)      16 (18.66)
Low           6 ( 7.33)       1 ( 7.33)      15 ( 7.33)

Somatotype Scores in Relation to Hip Abduction Levels
High          5 ( 4.33)       8 ( 4.33)       0 ( 4.33)       16
Medium       28 (27.66)      27 (27.66)      28 (27.66)       72
Low           2 ( 3   )       0 ( 3   )       7 ( 3   )       17

Somatotype Scores in Relation to Knee Flexion Levels
High         10 (10.66)      19 (10.66)       3 (10.66)       32
Medium       23 (21.66)      15 (21.66)      27 (21.66)       65
Low           2 ( 7.66)       1 ( 7.66)       5 ( 7.66)        8

1 The sum of the levels always equals 35.
2 The sum of the totals equals 105.


two large types. The only significant differences between them were found
in Trunk-Breadth-one (TB1), the distance between the uppermost visible
points of the posterior axillary folds, and Neck-Thickness-antero-posterior
(NTa-p), the shortest front-to-back measurement of the neck. When all the
somatotype measures were compared, NTa-p showed high correlations with
weight and TB1; ectomorphy showed high negative correlations with all other
variables. Table 3 shows the association of somatotype variables with each
other.


To determine whether there was an association between the variables of
flexibility and somatotype, zero-order correlations were used. Table 4 shows
the intercorrelation of flexibility and somatotype variables.
Neck-Thickness-antero-posterior exhibited a high negative correlation with hip
flexion. The highest positive correlation occurred when mesomorphy and neck
flexion were associated.

Table 5 illustrates the relationship of flexibility measures to composite
somatotype measures as indicated by regression coefficients of flexibility
measures and the multiple correlations. The composite criterion for each
somatotype variable was found by standardizing all these scores by McCall’s
T formula. The relative beta weights for each somatotype measure were
computed by the Doolittle technique and were useful in finding the multiple
correlations. The highest multiple R as determined from the beta weights
was exhibited by NTa-p.
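
Standardizing scores to a T scale can be sketched as follows. The sketch assumes the common linear form, T = 50 + 10z, with the standard deviation taken over the whole group. The paper does not spell out its formula, so both choices are assumptions, and the raw scores are hypothetical.

```python
from statistics import mean, pstdev


def mccall_t_scores(raw_scores):
    """Convert raw scores to T scores with mean 50 and standard deviation 10."""
    group_mean = mean(raw_scores)
    group_sd = pstdev(raw_scores)  # population S.D.; an assumption
    if group_sd == 0:
        raise ValueError("all scores are identical; T scores are undefined")
    return [50 + 10 * (score - group_mean) / group_sd for score in raw_scores]


# Hypothetical raw measurements on one somatotype variable.
raw = [10, 20, 30]
print([round(t, 2) for t in mccall_t_scores(raw)])
```

Putting every variable on the same scale in this way is what allows measures recorded in different units to be combined into one composite criterion.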

The flexibility scores, grouped according to somatotype, were expressed
in observed and theoretical frequencies. Table 6 shows the distribution of
flexibility scores according to somatotype, with observed and theoretical
frequencies. Table 7 illustrates the chi-square and contingency coefficients
for the flexibility measures when distributed according to extreme body types.
Neck flexion, neck rotation, and hip abduction showed a probability of
acceptance of this sample.
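
The chi-square and contingency coefficient for one flexibility measure can be sketched from the observed frequencies. The counts below are the neck flexion frequencies as they read in this copy of Table 6. The function name is hypothetical, and the contingency coefficient is taken as C = sqrt(chi² / (chi² + N)), which is an assumption about the form used.

```python
from math import sqrt


def chi_square_and_contingency(observed):
    """Return (chi-square, contingency coefficient) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Theoretical frequency from the row and column totals.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2, sqrt(chi2 / (chi2 + n))


# Rows: body types; columns: the three neck flexion levels.
neck_flexion = [
    [1, 29, 5],   # mesomorphy
    [1, 32, 2],   # ectomorphy
    [0, 8, 27],   # endomorphy
]
chi2, c = chi_square_and_contingency(neck_flexion)
print(round(chi2, 2), round(c, 3))
```

The theoretical frequencies computed inside the loop (about .67, 23, and 11.3 per body type) agree with the parenthesized values printed for neck flexion in Table 6.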

The amount of motion at a joint as stated by Goldthwaite (2) is dependent 
upon such factors as the type and shape of the joint surfaces, the freedom 
or inelasticity of the ligaments, and the condition of the protective muscles. 
X-rays of the extreme cases of flexibility showed varying distances between 
bone protuberances and sockets, but professional X-ray interpreters could 
not assure that the proximity or the surfaces of the contiguous bones were the 
cause of excessive or restricted flexibility in the extreme body types. 


Conclusions 
Conclusions drawn from these findings are as follows: 


1. The anatomical difference in range of flexibility at the joints could not 
be found through the use of the X-ray. 

2. The only significant differences between the two lateral types (endomorphic
and mesomorphic) were in Neck-Thickness-antero-posterior (NTa-p),
the front-to-back measurement of the neck, and Trunk-Breadth-one (TB1), the
widest transverse measurement of the back.

3. The only statistically significant flexibility measurements between the
three extreme body types were neck flexibility (neck flexion, neck rotation,
and neck lateral flexion); hip flexibility (hip abduction and hip extension);
and knee (flexion) and elbow (flexion) flexibility.

4. Neck flexion exhibited the highest correlations with other measures of
flexibility and somatotype and can be said to be the most significant measure
of flexibility when extreme body types are compared.

5. When significant flexibility and body type variables are compared, the
association indicates that as the mesomorphic component increases, neck
flexibility also increases and that as Neck-Thickness-antero-posterior
increases, the flexibility of the hip in range of flexion decreases.

6. Neck-Thickness-antero-posterior is the most significant somatotype
variable in the prediction of flexibility. This significance indicated that as
one’s front-to-rear neck size increases, one’s range of hip flexion decreases
significantly and one’s general flexibility decreases gradually.

7. The differences between observed and theoretical ranges of flexibility
were significant at the 1 per cent level, indicating that they would arise by
chance only once in 100 times and are therefore due to a significant
relationship.


REFERENCES 


1. Dupertuis, C. W., and J. H. Tanner. The pose of the subject for photogrammetric
anthropometry with special reference to somatotyping, American Journal of
Physical Anthropology, 7: 27-48, March 1950.

2. Goldthwaite, J. E., L. T. Brown, L. T. Swain, and G. D. Kuhns. Body Mechanics.
Philadelphia: Lippincott, 1937. 293 pp.

3. Leighton, Jack R. A simple objective and reliable measure of flexibility, Research
Quarterly, 13: 205-16, 1942.

4. Sheldon, W. H., S. S. Stevens, and W. B. Tucker. The Varieties of Human Physique.
New York: Harper and Bros., 1940.

5. Sills, Frank. A factor analysis of somatotypes and of their relationship to
achievement in motor skills, Research Quarterly, 21: 424-37, 1950.


(Submitted 5/31/57) 





Relationships of Lateral Dominance 
to Scores of Motor Ability and 
Selected Skill Tests¹


EUNICE E. WAY 


Smith College 
Northampton, Massachusetts 


Abstract 


The study reports an investigation of the incidence of the various laterality preferences
among college women, and the relationships of lateral dominance to general motor ability 
and to skills test scores in archery, badminton, bowling, and tennis. The results indicate 
that the majority of college women have definite lateral preferences; that women who 
have mixed eye, hand, and foot dominance are superior in motor ability to those who 
have homolateral or contralateral preference; that laterality seems to be of importance 
in activities stressing accuracy of direction toward a fixed target. 


LATERALITY has been of interest for centuries. This interest has prompted 
many theories as to the cause of lateral dominance. The investigator classified 
these theories into three groups: anatomical, hereditary, and social. The 
anatomical group included theories which stated that the causative factor was 
inequality of blood supply, cerebral development, ocular dominance, or 
visceral distribution. The hereditary character of laterality was studied by 
Jordan (5) and Ramaley (11). Both men concluded that handedness fol- 
lowed the Mendelian law of inheritance. Early education, habit, and primi- 
tive warfare as determinants of laterality were included among the social 
theories. 

There is wide variation in the incidence of laterality reported in the 
literature. The dissimilar results may be due to the fact that data may not be 
comparable, as ages, tests, situations, and criteria differ among the various 
studies. Some of the variation may also be due to the difficulties of definition. 
Tests measuring preference for use, order of response, strength, dexterity, and 
steadiness have been used to measure dominance. 

The literature appears to indicate that there is no unitary trait that can be 
called dominance. Jasper and Raney (3) stated three possibilities of asym- 
metric function in the eyes: dominance in motor control, dominance in the 
receptor mechanism, and dominance in the central projection areas. Heinlein 
(6) demonstrated the same thing in relation to handedness. Her study showed 
differences in dominance in relation to tests of muscular strength, steadiness 
of motor control, small muscle control, and precision in large muscle 
activity. 


¹ Portions of this study were done in partial fulfillment of the requirements for the
degree of Doctor of Philosophy at the University of Washington, Seattle.

Little has been written of the relationship of laterality to the performance
of physical education skills. In the few studies available, most investigators 
have studied the ability to perform skills in relation to lateral dominance as 
measured by tests of small muscle involvement. Heinlein, however, found a 
difference in dominance when measured by tests of small muscle control and 
tests of precision in large muscle activity; therefore the writer believed that 
an investigation using tests which involved the use of large muscles was neces- 
sary. 

Purpose 

The purpose of this study was to investigate the incidence of the various 
laterality preferences among college women; the relationship of lateral 
dominance with general motor ability; and the relationships of laterality 
with skills test scores in archery, badminton, bowling, and tennis. 


Subjects 

The study was limited to 410 freshman and sophomore women enrolled in
the required program of physical education at the University of Washington. 
All subjects were between the ages of 17 and 25 and had been examined by 
a physician and found free of abnormalities. All subjects were enrolled in 
one of the beginning sections of archery, badminton, bowling, or tennis. 


Instruments of Measurement 
MOTOR ABILITY 

The Scott Motor Ability Test (13) was given to determine the competency 
of each subject in sports skills. This indication of motor aptitude was used 
because it has served as an adequate measure of ability to perform large 
muscle activities. 


EYE DOMINANCE 

The Miles A-B-C Test of Ocular Dominance (9) was chosen to determine 
the unconscious eye preference of each subject as this test reportedly is not 
influenced by handedness (2). The eye dominance test was scored by sub- 
tracting the left eye choice from the total number of choices. 


HAND DEXTERITY 

A modification of the Johnson dart board test (4, 15) was used to deter- 
mine the degree of dexterity of each hand. The modified test had predicted 
reliabilities² of .56 for the left hand and .73 for the right hand. The preferred
hand was determined by subtracting left hand score from right hand score. 


FOOT DEXTERITY 
To determine the degree of dexterity of each foot, the footedness test by 
Turner (14) was administered. The predicted reliabilities of the test as
determined in this study were .92 for the left foot and .85 for the right foot.
The preferred foot was determined by subtracting the score for the left foot
from the score for the right foot.


² Predicted reliabilities are based on odd-even correlations raised by the Spearman-
Brown formula. Except for the skill test in archery, these reliabilities were computed
by the writer.
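The Spearman-Brown step-up used for the predicted reliabilities can be sketched as follows. This is only an illustration of the standard formula; the function name and the odd-even correlation of .60 are hypothetical and are not taken from the study.

```python
def spearman_brown(half_test_r: float, length_factor: float = 2.0) -> float:
    """Predict full-length reliability from a part-test correlation.

    With length_factor = 2 this is the usual split-half (odd-even) step-up:
    r_full = 2r / (1 + r).
    """
    return (length_factor * half_test_r) / (1.0 + (length_factor - 1.0) * half_test_r)


# Hypothetical odd-even correlation; not a value reported in the study.
odd_even_r = 0.60
print(round(spearman_brown(odd_even_r), 3))
```

The stepped-up value is always at least as large as the odd-even correlation itself, which is why it is reported as a "predicted" reliability for the full-length test.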


SKILL TEST SCORES 

Skill test scores in the physical education activities were obtained in dif- 
ferent ways. For archery and bowling, scores were recorded by the students 
and checked by the instructors daily throughout the quarter. Skill in archery 
was measured by the sum of the six best ends multiplied by the total number 
of hits (8). The predicted reliability of the score at the 30-yard distance was 
.912; of the score at the 40-yard distance, .915. A score was computed for 
both distances, then these were added. 

Bowling skill was measured by the sum of the six best games bowled 
during the quarter. The predicted reliability was .88. 

Scores indicating the degree of skill in badminton and tennis were obtained 
by specific skills tests given at the end of the quarter. The Miller wall volley 
test (10) was used in badminton and the Broer-Miller forehand and backhand 
drive tests (1) were used to determine skill in tennis. 


Procedure 

Selected volunteers from the students majoring in the School of Physical 
and Health Education, Department for Women, administered the laterality 
and motor ability tests. All were given written and oral instructions pertain- 
ing to the administration of the test with which they were to work. 

Tests of motor ability and lateral preference were given during a three- 
week period of time. Two class periods were required for each section with 
the exception of the archery class and one small tennis section where one 
class period was sufficient. There was no definite order for completion of the 
tests. Students were asked to go to the testing station where the fewest were 
waiting. Test scores were recorded by the examiners on score cards which 
the subjects carried with them to each testing station. 

In the activities in which a final skill test was given (badminton and 
tennis), the test was given at the end of the quarter by the instructor of each 
section. Reports of the scores on the final skills tests were given to the 
investigator. In archery and bowling, a record of all scores made during 
the quarter was given to the investigator who recorded the best scores for 
each subject on her score card. 


Treatment of Data 
Frequency distributions and histograms of the scores on eye dominance, 
hand dexterity, and foot dexterity were made. These distributions and histo- 
grams and the reports of the previous studies of eyedness, handedness, and 
footedness were inspected. Scores were established arbitrarily to define the 
limits of each preference as follows: 
Eyedness: Scores of 0-3, left eye preference
                    4-6, ambiocular
                    7-10, right eye preference

Handedness and footedness: Scores of +2 and over, right preference
                                     ±1, ambidexterity
                                     -2 and over, left preference
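The scoring rules and arbitrary limits described above can be expressed as a short sketch. The function names are hypothetical, the total of 10 eye choices is inferred from the 0 to 10 score range, and treating every right-minus-left difference between -2 and +2 as ambidexterity is an assumption about how the limits were applied.

```python
def eye_score(left_choices: int, total_choices: int = 10) -> int:
    """Eye score = total choices minus left-eye choices (total of 10 is assumed)."""
    return total_choices - left_choices


def classify_eyedness(score: int) -> str:
    """Apply the limits 0-3 left, 4-6 ambiocular, 7-10 right."""
    if not 0 <= score <= 10:
        raise ValueError("eye score must lie between 0 and 10")
    if score <= 3:
        return "left"
    if score <= 6:
        return "ambiocular"
    return "right"


def classify_dexterity(right_score: float, left_score: float) -> str:
    """Classify hand or foot preference from the right-minus-left difference."""
    difference = right_score - left_score
    if difference >= 2:
        return "right"
    if difference <= -2:
        return "left"
    return "ambidextrous"  # assumed: any difference strictly between -2 and +2


print(classify_eyedness(eye_score(left_choices=2)))
print(classify_dexterity(right_score=20, left_score=18))
```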


Laterality groups were compared to determine the significance of the dif- 
ference between the mean scores for the general motor ability test. For each 
variable respectively (eye, hand, and foot), comparisons of motor ability 
were made among subjects who had right and left preferences, right and 
ambidextrous preferences, and left and ambidextrous preferences. 

Combinations of two of the variables were classified as pure and mixed 
dominance. Subjects with pure dominance were those who had the same 
preference in both variables. The comparisons were made between the vari- 
ous eye-hand preferences, the eye-foot preferences, and the hand-foot pref- 
erences. 

Subjects having the same preference for three variables were termed homo- 
laterals. Those subjects who had the same preference for two variables and 
a different preference for the third were called contralaterals. There were a 
few subjects who had a different preference for each variable (one right, 
one left, and one ambidextrous). These were classified as having mixed 
preference. Comparisons were made between these groups. 
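The three-variable grouping defined above depends only on how many distinct preferences a subject shows. A minimal sketch follows; the labels "right", "left", and "none" are hypothetical, and treating "ambiocular" and "ambidextrous" as the same "none" category is an assumption.

```python
def classify_laterality(eye: str, hand: str, foot: str) -> str:
    """Group a subject by eye, hand, and foot preference.

    Each argument is one of "right", "left", or "none" (no apparent preference).
    """
    allowed = {"right", "left", "none"}
    preferences = (eye, hand, foot)
    if not set(preferences) <= allowed:
        raise ValueError("preferences must be 'right', 'left', or 'none'")
    distinct = len(set(preferences))
    if distinct == 1:
        return "homolateral"    # same preference for all three variables
    if distinct == 2:
        return "contralateral"  # two agree, the third differs
    return "mixed"              # one right, one left, one with no preference


print(classify_laterality("right", "right", "none"))
```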

All differences between the mean scores of motor ability were tested by 
the t-ratio to determine the significance of the difference. 
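The study does not give the formula used for the t-ratio. The sketch below shows one common form for two independent groups, the difference between means divided by the standard error of that difference with unpooled variances. The choice of formula, the function name, and the scores are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def t_ratio(group_a, group_b):
    """Difference between two group means divided by its standard error.

    Uses sample variances (n - 1 in the denominator) and does not pool them.
    """
    se_difference = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / se_difference


# Hypothetical motor ability scores; not data from the study.
ambidextrous_feet = [52, 55, 58, 61, 64]
right_footed = [48, 50, 52, 54, 56]
print(round(t_ratio(ambidextrous_feet, right_footed), 2))
```

The resulting ratio would then be referred to a t table at the appropriate degrees of freedom to obtain the level of confidence.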


In order to eliminate the influences of differences in general ability and 
background in physical activities when comparing the skill of the laterality 
groups in various physical education activities, the groups were equated on 
the basis of motor ability test scores. Owing to pairing in terms of motor 
ability and to incomplete data for some individuals, the number of cases 
was materially reduced in this section. The numbers of pairs of cases in 
archery ranged from 3 to 8; in badminton, from 10 to 32; in bowling from 
5 to 52; and in tennis, from 3 to 19. Equated laterality groups were com- 
pared by means of the t-ratio to determine the significance of the differences 
between the mean scores of skill in each activity. 


Findings 
INCIDENCE OF LATERALITY PREFERENCES 


Eye dominance. Findings in relation to eye preference indicated a definite 
dominance in 96 per cent of 410 cases, with 61.9 per cent preferring to use 
the right eye. Figure I shows the distribution of the scores. The portion of 
the distribution representing each eye preference is indicated. 

Hand dominance. Scores of hand preference (Figure II) were distributed
normally with the mean score of 13.6 well to the right of the point where
the right hand score was equal to the left hand score (0). This may indicate
the influence of a predominantly right-handed culture although it cannot be
proved by this study. The standard deviation of the distribution was 11.25.
A definite hand dominance was indicated by 95.7 per cent of 400 cases, of
which 84 per cent preferred to use the right hand. Four and two-tenths per
cent of the subjects had no apparent preference.

Figure I. Distribution of Eye Preference Scores.
Figure II. Distribution of Hand Preference Scores.
Figure III. Distribution of Foot Preference Scores.


Foot dominance. Foot preference scores (Figure III) approximated the 
normal curve with a mean of 0.7 and a standard deviation of 4.25. Only 47.3 
per cent of the total group (398 cases) preferred the use of the right foot in 
dexterity movements; 29.4 per cent had no definite preference; and 23 per 
cent preferred the use of the left foot. The large number of subjects with 
no apparent preference for the use of one foot was of interest. Footedness 
was the only variable investigated in which ambidexterity was found to be 
more common than was sinistrality. 

Eye-hand preference. Pure eye-hand preference was found in 57.0 per cent 
of 397 cases, with 53 per cent of these demonstrating a preference for the 
use of the right members. Only 0.5 per cent (two subjects) had no apparent 
preference for either variable. Of the groups of subjects demonstrating
mixed preference, the largest group (28.4%) preferred the use of the left 
eye and the right hand. 

Eye-foot preference. Only 36.5 per cent of 392 subjects indicated pure eye- 
foot relationships. Of these, 27.4 per cent preferred the use of the right 
members. Among the subjects demonstrating mixed preference, 20.3 per 
cent preferred the use of the right eye and either foot, while 18.3 per cent 
preferred to use the left eye and the right foot and 14 per cent preferred 
the right eye and the left foot. It should be noted that in eye-foot preference 
there are three mixed preference groups which are large, whereas in the eye- 
hand preference only one mixed preference group was large. 


Hand-foot preference. Pure hand-foot preference was indicated in 46.3 per 
cent of 391 cases, with 41.6 per cent preferring the right members. Of the 
subjects who indicated mixed preference, 24.8 per cent preferred the use 
of the right hand but had no apparent preference in footedness. The second 
most prevalent combination was that of the right hand and the left foot. 
These two groups account for approximately 80 per cent of the subjects with 
mixed hand-foot preference. 

Eye-hand-foot preference. The majority of 390 subjects (62.5%) demonstrated
contralateral preference (same preference for two variables and a different
preference for the other). Of these subjects, 50.3 per cent had a
non-corresponding foot and 32.2 per cent had a non-corresponding eye. Only 26.4
per cent of the total cases indicated homolateral preference (preference for
use of the same eye, hand, and foot). A small percentage of the cases (11%)
showed mixed preference (preference for one right member, one left member,
and no apparent preference for the third variable).


RELATIONSHIP OF LATERAL DOMINANCE TO GENERAL MOTOR ABILITY 


Single variable. The differences between the means of the laterality groups 
selected on the basis of eye dominance or of hand dominance were not 
significant. The mean score of the ambidextral footedness group was signifi- 
cantly higher than the mean score of the group preferring the right foot 
(1% level of confidence) and higher than the mean score of the group
preferring the left foot (5% level of confidence). 

Eye-hand preference. There were no significant differences between motor 
ability means of the laterality groups determined by eye-hand preference.

Hand-foot preference. In general, the mean motor ability scores of those
groups having mixed hand-foot dominance seemed to be higher than the 
mean motor ability scores of the groups having pure dominance. Exceptions 
to this were those groups having combinations of right and left dominance. 
The mean score of the group having right hand-left foot dominance was 
lower than the mean scores of the groups having pure dominance. The mean 
score of the group having left hand-right foot dominance was only slightly 
higher than the mean score of the group with pure right dominance and 
equal to the mean score of the pure left dominance group. 
The group with right hand dominance and ambidextrous feet had a higher
degree of motor ability than those with either pure right dominance or right 
hand-left foot dominance (10% level of confidence). The group having the 
combination of left hand dominance and ambidextrous feet tended to have 
a higher degree of motor ability than the groups who had pure right
dominance or right hand-left foot dominance (20% level of confidence).

Eye-foot preference. Persons having left eye dominance and ambidextrous
feet appeared to have higher motor ability scores than any other eye-foot
preference group. This group (left eye dominance and ambidextrous feet)
had a mean score which was: 1. significantly higher than the mean score of
the group preferring the right eye and left foot (2% level of confidence);
2. higher than the mean score of the group who had pure right dominance
(5% level of confidence); 3. higher than the mean score of the groups which
were ambiocular and right footed or left eyed and right footed (10% level
of confidence); and 4. higher than the group who had pure left dominance
(20% level of confidence).

The right eyed group with no apparent preference of foot had a higher
mean score than did the right eyed-left footed group (10% level of
confidence), the pure right dominance group (20% level of confidence), and the
ambiocular-right footed group (20% level of confidence).

Eye-hand-foot preference. Those women having mixed dominance of three
variables had higher motor ability scores than did those women with homo- 
lateral or contralateral preference. The mean motor ability score of the 
mixed dominance group was superior to the mean score of the homolaterals 
(5% level of confidence) and was higher than the mean motor ability score 
of the contralaterals (10% level of confidence). The difference between the 
mean motor ability scores of the homolaterals and of the contralaterals was 
not significant; however, the contralaterals tended to have higher scores.

Among the mixed dominance groups, no combination seemed to be superior 
to another. Among the various contralateral groups there tended to be some 
superiority. The mean score of the women with a non-corresponding right 
eye was higher than the mean score of those with a non-corresponding right 
hand (20% level of confidence). The mean motor ability score of the women 
with non-corresponding ambidextrous feet was higher than the mean score 
of those with a non-corresponding left foot (20% level of confidence). 


RELATIONSHIP OF LATERALITY TO SKILL TEST SCORES 


In relating laterality to skill test scores in an activity, it was interesting 
to note that laterality seemed to be of more importance in the activities which 
stress accuracy of direction toward a fixed target (archery and bowling). 
Comparisons of nine sets of data yielded differences between the means of 
the scores which were significant at the 20 per cent level of confidence or 
better. Of these nine comparisons, four were in archery and three were in 
bowling. 

Archery. In archery, the mean skill test score of the group preferring the 
left eye was higher than the mean score of the group preferring to use the 
right eye (20 per cent level of confidence). Those women with ambidextrous
feet had higher scores than did those with a dominant right foot (10% level 
of confidence). The groups with mixed eye-foot dominance and with
contralateral preference had higher mean scores than did those who demonstrated
pure eye-foot dominance or homolateral preference (20% level of
confidence). The unexpected relationship of foot ambidexterity to skill in archery
might possibly indicate the importance of stability. Could ambidexterity 
mean that the pressure of the two feet on the ground would be equal, thus 
insuring a balanced position from which to shoot? It should be noted that 
the small number of subjects made it difficult to draw conclusions. However, 
the results obtained in archery are consistent, except in the case of handed- 
ness, in that the group of subjects preferring to use the right member or 
those groups indicating pure dominance had lower mean scores. 

Bowling. Bowling scores indicated greater skill for the right-eyed subjects 
than for the left-eyed subjects (20% level of confidence). Those women with 
ambidextrous hands had a higher mean score of success than did those who 
were left-handed (10% level of confidence). The group with pure hand-foot 
dominance had a higher mean score than did the group with mixed hand-foot 
dominance (20% level of confidence). 

Badminton. Only in badminton was a significance ratio high enough to war- 
rant confidence in the results. Subjects who preferred the use of the left 
eye had higher skill test scores than those who preferred the right eye (2 per 
cent level of confidence). Although it seemed logical to expect some signifi- 
cant difference between the means of the eye-hand preference groups, the data 
indicated only chance relationships. 

Tennis. In only one comparison in tennis was the difference significant at 
the 20 per cent level of confidence. In this comparison, the group of subjects 
who were ambidextrous had a higher mean score than did the group pre- 
ferring to use the right hand. 


The relationships revealed in badminton and tennis were expected to be 
similar because of the similarity of the two activities. This was not the case, 
particularly with regard to eyedness. The writer believed that the difference 
in results may have been due to the characteristics of the tests involved. The 
badminton test involved striking the shuttlecock after it had rebounded from 
the wall whereas the tennis test did not require receiving and returning the 
ball. It is believed that the repeated striking of a moving object involves 
the use of the eyes much more than does a single stroke as was used in the 
test of skill in tennis. 


Conclusions 


Within the limitations of this study, the following conclusions seem justi- 
fied: 

1. The majority of college women have definite eye, hand, and foot
preferences.

2. Motor ability seems to be related to foot ambidexterity, since in all of
the relationships showing significant differences the combinations including 
foot ambidexterity resulted in higher motor ability scores. 

3. Women who have mixed eye, hand, and foot dominance are superior in 
motor ability to those who have homolateral or contralateral preference. 

4. Eye dominance and lateral dexterity seem to have some relationship to
skill in archery, badminton, bowling, and tennis. 

5. Laterality seems to be of more importance in the activities stressing 
accuracy of direction toward a fixed target (archery and bowling) than in 
activities which do not. 

There is no indication that homolateral preference results in a higher 
degree of skill than does contralateral preference. 
49. Carns, Marie L., and Ruth Glassow. Changes in body volume accompanying
weight reduction in college women. Human Biology, 29: 4: 305 (Dec. 1957). 

Ten undergraduate University women volunteered to go on a reducing diet for 6 
weeks. The percentage loss in volume was nearly twice that of the percentage loss of 
weight. While there was an over-all loss in volume, the area of greatest loss differed 
among the girls but the losses were proportionately higher where fat deposits were larg- 
est. After dieting, the body proportions showed a decrease in individual differences in 
lower extremities and in shoulder and upper extremities with an increase in individual 
differences in the trunk.—D. B. Van Dalen. 


50. Colville, Frances M. The learning of motor skills as influenced by knowledge of
mechanical principles. J. of Educ. Psych., 48: 6: 321 (Oct. 1957).

Three parallel experiments (ball rolling, catching, and archery) constituted the
investigation. There is no evidence that instruction concerning mechanical principles
utilized in the performance of a motor skill facilitates the initial learning of the skill
to any greater extent than an equivalent amount of time spent in practicing the skill;
also, there is no evidence that such knowledge facilitates subsequent learning as
evidenced in the performance of a similar or more complicated skill to which the same
principle is applicable.—D. B. Van Dalen.


51. Dail, C. W., and J. E. Affeldt. Effect of body position on respiratory muscle
functions. Arch. Physic. Med. and Rehab., 38: 427-434 (July 1957).

Respiratory volumes in horizontal and erect positions were recorded in normal and 
paralytic subjects. Change from the recumbent to erect position caused an increase in 
the expiratory reserve volume and decrease in the respiratory reserve volume, thus caus- 
ing no change in vital capacity. Relationship between respiratory volumes and the ease 
of breathing depends on the type and extent of paralysis. Sanborn basal metabolism 
apparatus and tilt table were used.—Peter V. Karpovich. 


52. Feinblatt, T. H., H. M. Feinblatt, E. A. Ferguson, Jr., A. H. Price, J. E. Healey,
Jr., and W. Allison. Clinical and blood chemical studies with Ascriptin, with
particular reference to headaches and arthritic pains. New York State J. Med.,
58: 697 (March 1958).

As compared with aspirin, Ascriptin (aspirin buffered with magnesium aluminum 
hydroxide) produced a 160% higher blood salicylate level in 10 minutes, 250% higher 
in 20 minutes, 264% higher in 30 minutes, 134% higher in one hour, and 60% higher 
in 24 hours. As compared with aspirin buffered with a mixture of aluminum glycinate 
and magnesium carbonate, the Ascriptin blood salicylate levels were also higher at 
the above time intervals. In a clinical comparison, a single dose of Ascriptin relieved 
headaches in an average of 9 minutes as compared with 16 minutes for aspirin. Ascrip- 
tin relieved arthritic pains in an average of 17 minutes; aspirin, in 25 minutes. Aspirin 
caused gastric irritation in 20% of cases. All of these patients took Ascriptin without 
discomfort, as did others with duodenal ulcer known to be sensitive to aspirin. Both 
Ascriptin and aspirin proved satisfactory for relief of headaches, arthritic pains and 
joint immobility. In the clinical comparison, Ascriptin gave faster or greater relief to
a larger percentage of patients for all three of these symptoms.—H. M. Feinblatt. 


53. Imig, C. J., H. Gaskill, Apetta Bauer, and H. M. Hines. Measurement of peripheral
blood flow under condition of physiologic stress. Arch. Physic. Med. and
Rehab., 38: 571-573 (Sept. 1957).

Venous occlusion plethysmography technique for measuring the volume of blood flow 
is of greater diagnostic value when used after a standard exercise. During condition of 
rest, adequate circulation may prevail in the presence of definite vascular pathology. 
—Peter V. Karpovich. 


54. Kimmel, Herbert D. Three criteria for the use of one-tailed tests. Psychological
Bulletin, 54: 351-353 (July 1957). 

Different opinions have been reported in the literature about when a one-tailed test 
of significance should be used. Actually, the argument is not one of mathematical sta- 
tistics but an argument of experimental logic. The disagreement is not over using one- 
tailed tests to test one-tailed hypotheses. Rather, the disagreement is over when one- 
tailed hypotheses should be made. Since no set of standards can prescribe beforehand 
either the logical or ethical decisions of individual scientists, three temporary criteria 
for the use of one-tailed tests are: 

1. When a difference in the opposite direction to that predicted is meaningless. Such 
a situation would be found when a trained skilled task experimental group is compared 
with an untrained control group. 

2. When results in the opposite direction to the one predicted are not to be used to 
decide upon a course of action that is different from the course of action if no difference 
is found. Such a situation might occur when one attempts to see if a new product is 
better than the one now on the market. 

3. When in psychological theory results in the opposite direction are not deducible. 
If an opposite direction is explainable in existing theory, the hypotheses must be stated 
such that opposite results can be evaluated.—Frances Z. Cumbee. 


55. Kottke, E. J., Jean Danz, and W. G. Kubicek. Study of cardiac output during
rehabilitation activities. Arch. Physic. Med. and Rehab., 38: 72-82 (Feb. 1957).

Subjects were normal young women in fasting state. Cardiac output was measured
by Grollman's method and the O2 consumption by either closed or open circuit. Cardiac
output increases, expressed in per cent of cardiac output in recumbent position, were:
reclining at 45°, 8; getting out and in bed, 4; leather stamping, 11; chip carving, 15;
leather tooling, 16; printing press, 60; floor loom weaving, 70; bicycle, 138.—Peter V.
Karpovich.


56. Myers, Alonzo, editor. Planning for Retirement. Journal of Educational Sociology.
31: 281-328 (April 1958). 

The above title is the general topic for the entire issue which consists of six related 
articles. One article reports a questionnaire study of 107 retired faculty and staff mem- 
bers from New York University. Another article deals with a panel discussion of the 
health and medical aspects of retirement. Other article titles are “University Retire- 
ment Policies,” “Placement Programs for Retiring or Retired Professors,” “Financial 
Planning for Retirement,” and “Essential Elements of a Good Retirement Plan.”—Bruce 
L. Bennett. 


57. Proctor, R. C., W. H. Bailey, and W. G. Morehouse. An analeptic tranquilizer
for senile psychoses, report of clinical and pharmacological studies of Nicozol
with Reserpine. J. Am. Geriatrics Soc., 6: 291 (April 1958).

Clinical and pharmacological studies demonstrated that Nicozol with Reserpine
(pentylenetetrazol 100 mg., niacin 50 mg., and reserpine 0.25 mg.) provides a safe and
highly effective treatment for senile psychoses. This medication combines the analeptic
and vasodilator actions of Nicozol with the tranquilizing effect of Reserpine. With this
therapy many patients who otherwise would have required institutional care were managed
at home with a minimum of nursing attention.

In a series of 75 cases of senile psychoses treated with Nicozol with Reserpine, 65 
(87%) showed improvement. The therapy afforded relief of agitation and restlessness 
together with improved memory, behavior, sociability, appearance and tidiness. Symp- 
toms of confusion, aggressiveness, hostility and disorientation were relieved. The only 
side-effect was transitory flushing of the skin in two cases. There were no convulsions 
in any case. Pharmacological studies on mice indicated that reserpine does not poten- 
tiate orally induced pentylenetetrazol convulsive seizures.—D. B. Van Dalen. 


58. Swartz, Jacob, Herbert I. Posin, and Abraham Kaye. Psychiatric problems in
an urban university. Mental Hygiene, 42: 224-228 (April 1958). 

The authors, who are psychiatrists at Boston University, report on their psychiatric 
student patients for a four-year period from 1952 to 1956. Seventy-eight percent of 
these students presented emotional problems consistent with long-standing emotional 
difficulties. Only 11 percent suffered from transient situational disorders. 

The authors also found considerable difficulty in trying to assist students of good or 
superior intelligence who were doing work far below their capacity. The major part 
of the time and effort of the psychiatric staff was devoted to psychotherapy, the purpose 
of which was to help the student meet his immediate problems and current life situation. 
—Bruce L. Bennett. 


59. Trevarrow, Virginia E. Longitudinal study of plasma fibrinogen in children.
Human Biology, 29: 354 (Dec. 1957). 

Plasma fibrinogen was determined in 88 children who were followed over a period 
of several years. The level of fibrinogen changed with age. Individuals differ from each 
other in the level of fibrinogen, in their variability, and possibly in their fibrinogen 
response to infection. These differences show no correlation with frequency or duration 
of infections, sibling position in the family, or protein intake. The level of fibrinogen 
does not tend to be similar with the children of some families. Environmental factors 
do not correlate with fibrinogen level.—D. B. Van Dalen.


60. Wakim, K. G., and F. H. Krusen. Comparison of effects of electric stimulation
with effects of intermittent compression on the work output and endurance of 
denervated muscle. Arch. Phys. Med. and Rehab., 38: 21-23 (Jan. 1957). 

Rats with one leg paralyzed by denervation were used. These legs were stimulated 
every day either electrically or by intermittent compression. After 25-30 days, animals 
were anesthetized and work output of normal and paralyzed muscles was determined. 
Work capacity was 31.9-59.2% of normal after electrical stimulation, 14.9% after
intermittent compression, and 15.6% when unstimulated. The beneficial effect of electric
stimulation is not, therefore, due to the intermittent compression and decompression
through which muscles go during contraction.—Peter V. Karpovich.
Notes & Comments


COMMENTS 


Comments on the article by Donald B. Swegan, Gene T. Yankosky, and James 
Williams, III, “Effect of Repetition Upon Speed of Preferred-Arm Extension” in the
March 1958 Research Quarterly. 


The attention of the writers has been directed to several errors which inadvertently 
occurred in the above article. 

The reference cited for Joseph E. Hipple was incorrect. The study referred to was
found in the Research Quarterly, 26: 246-247, and entitled “Warm-up and Fatigue in 
Junior High School Sprints.” 

Hipple’s study and one done by Betty A. Pacheco (“Improvement in Jumping Per- 
formance Due to Preliminary Exercise,” Research Quarterly, 28: 55-63, March 1957) 
both deal with repetition as a form of warm-up and should be mentioned in this category 
in the introductory material. 

Table 4, page 81, should read as follows: Critical Ratios for the Comparison of Each 
Group of Five Trials with Each Successive Group of Five Trials for 11 Subjects 
Executing 50 Repetitions of the Preferred-Arm Extension Movement (Phase II Only). 

Appreciation is expressed to Dr. Franklin M. Henry, University of California, 
Berkeley, for his constructive criticisms of this article—Donald B. Swegan, Pennsylvania 
State University. 











Guide to Authors 


IN LINE WITH the over-all goal of making Association publications yield 
the greatest value to the individual and the profession, the following is a 
guide for the preparation of manuscripts for the Research Quarterly, recog- 
nizing general techniques employed by research publications. 


Article Manuscripts 


Manuscripts should be sent to the Editor (AAHPER, 1201 Sixteenth Street, 
Northwest, Washington 6, D. C.), who will see that each one is read by at 
least three members of the Research Quarterly Board of Associate Editors. 
On the basis of the three reviews, the Editor will advise the author as to the 
suitability of the paper or the desirability for revision. Papers are not judged 
by arbitrary standards but on their content of new research results in the 
field of physical education, health education, and recreation, presented with 
the greatest brevity compatible with scientific accuracy and clarity (see Octo- 
ber 1951 Quarterly, p. 392-94). 

Since three members of the Board of Associate Editors review an article, 
it is requested that three clear copies of the manuscript be submitted in order 
to facilitate reviewing. A fourth copy of the article should be retained by 
the author. 

Typewritten manuscript should be double spaced on white paper of ordinary 
weight and standard size (8½ x 11 inches). A brief abstract of the 
article, 100 words or less, should be typed double space on a separate sheet 
(see abstracts at head of Quarterly articles for style). 

The sheets of manuscript should be kept flat and fastened with clips which 
can be removed easily. The pages of the typewritten copy should be numbered 
consecutively in the upper right-hand corner. Paragraphs should be num- 
bered consecutively throughout the manuscript. 


Notes and Comments 


Notes on minor research and on apparatus, objective critical comments, 
and summaries of status surveys will be printed in the Notes and Comments 
section. Simple status surveys are no longer acceptable as regular Quarterly 
articles, by decision of the Research Council. Such studies will be published 
in brief form (300-500 words) under Notes and Comments. 


Headings 

The article should be arranged so as to indicate relative values of heading 
and subheadings. 

Usually four gradations are sufficient: (a) article title, (b) first subhead 
appearing in boldface aligned left on page (underscored in manuscript with 
wavy line), (c) second subhead (only if necessary) appearing in capital 
letters aligned left on page, (d) third subhead, to appear in italic (under- 
scored in manuscript), not centered, but run in at the beginning of the para- 
graph or section. 
All headings should be typed in lower case with initial capitals, except for 
(c) above. 


Documentation 
FOOTNOTES 


Footnotes are not to be used for references or literature citations. They are 
rather used for the purpose of acknowledgment, special explanation, supple- 
mentary information, etc., as in the examples below. 

Type footnotes (if any) on separate sheets, as many footnotes as conveni- 
ent being written on a sheet. Footnotes should be numbered from 1 up for 
each article, a corresponding numeral appearing in the text. Asterisks should 
not be used. 


Examples of Footnotes: 

1 This study was made under the direction of Dr. Arthur T. Slater-Hammel in the Re- 
search Laboratories, School of Health, Physical Education, and Recreation, Indiana Uni- 
versity, Bloomington, Indiana. 

2 All measurements of the hand were recorded in centimeters and height was recorded 
in inches. The hand measurements were taken by Everett and reliability coefficients of 
above .90 were found for each measurement used in the study. 

3 For their wholehearted co-operation in facilitating collection of the data, special grati- 
tude is extended to Superintendent Clarence Hines and the 1950-51 principals of the 
Adams, Condon, Edison, Francis Willard, Harris, Howard, Lincoln, River Road, and 
Whiteaker schools. 


CITATIONS OF LITERATURE 

Citations of literature should be segregated alphabetically by author’s last 
name at the end of each article, under the heading of “REFERENCES.” Do 
not treat them as footnotes. 

The literature citations, listed alphabetically, should be numbered consecu- 
tively, their location in the text being indicated by corresponding numbers 
enclosed in parentheses: for example, (1) (2, 3). If there are several refer- 
ences in the text to a citation, the specific pages may be indicated thus: 
(1, p. 117), (1, p. 162-63). 

A uniform style should be maintained in writing citations. Enclose titles 
of chapters and articles in quotation marks. Italicize (underscore in manu- 
script) names of books, periodicals, bulletins, etc. (see examples below). 

Uniform sequence of data should be observed, as follows: For a book— 
Author’s name (last name first); title of article or chapter; name of book; 
place of publication; publisher; year date. For a periodical—Author’s name 
(last name first); title of article or chapter; name of periodical; volume 
number; inclusive page numbers; month and year date. 

Examples of References: 

1. American Association for Health, Physical Education, and Recreation. “Suggested 
Platforms for Health Education.” Journal of the American Association for Health- 
Physical Education-Recreation 18: 436; Sept. 1947. 

2. American Association of School Administrators. Health in Schools. Washington, 
D. C.; the Association, a department of the National Education Association, 1951. 
3. Deaver, G. G. “Exercise and Heart Disease.” Research Quarterly 26: 24-34; Oct. 
1955. 

4. Ogden, Jean, and White, Jess. Small Communities in Action. New York: Harper & 
Brothers, 1956. 

5. Potter, John Nicholas. Physical Fitness of Junior High School Boys. Unpublished 
Master’s thesis. Berkeley: University of California, 1952. 
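
As an illustration only, the periodical sequence described above can be expressed as a small formatting routine. The sketch below is in Python; the function name `format_periodical` and its parameter names are hypothetical and are not part of the Quarterly's instructions. It reproduces the punctuation of example 3, with the caveat that plain text cannot show the italics required for the periodical name.

```python
def format_periodical(author, title, periodical, volume, pages, date):
    """Assemble a periodical citation in the order the guide lists:
    author (last name first), quoted article title, periodical name,
    volume number, inclusive pages, then month and year."""
    # The punctuation mirrors example 3 above; italics for the
    # periodical name cannot be represented in a plain string.
    return f"{author} “{title}.” {periodical} {volume}: {pages}; {date}."


print(format_periodical(
    "Deaver, G. G.",
    "Exercise and Heart Disease",
    "Research Quarterly",
    26,
    "24-34",
    "Oct. 1955",
))
```

The author string is expected to carry its own closing period, as in "Deaver, G. G.", so the routine does not add one.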


Tabular Matter 

Each table should have a descriptive heading and should be specifically 
referred to in the text by number, e.g., “Table 10,” never as “the above 
table” or “the following table.” Number tables from 1 up for the entire manu- 
script, using Arabic numerals. Do not duplicate data by giving it in both 
tables and graphs. 

Tables should be typewritten double space, like the rest of the material in 
the manuscript. They should be typed on separate sheets, as the printer will 
set them on a different machine from the one used for the text matter. If a 
table continues on a second sheet, it is not necessary to repeat the boxheads, 
since the printer will repeat from the original boxheads, when necessary. 

The word “TABLE” should be written in capital letters, as: “TABLE 1”; 
the table title should also be written in capitals, centered over the table. 
Tables should be ruled as desired, except that no rules will appear at the 
extreme right and left edges of the table. No double rules are to be used, 
unless necessary for clarity. 

Well-known statistical formulas should be omitted. Extensive tabular 
material, raw data, and appendixes should not be printed; the author can 
mention in a footnote that he will supply such material in mimeographed form 
on request. 


Illustrations 


Illustrative material is of two types: pen and ink drawings, which are re- 
produced by the line engraving process; and photographs, wash drawings, 
stipple drawings (in short, anything containing shading), which are repro- 
duced by the halftone process. All illustrative material (considered as fig- 
ures) should be numbered consecutively from I up for the entire manuscript. 
Use Roman numerals to number figures. 

Line engravings are treated as figures and should be numbered as Figure 
I, II, etc. All drawings should be made with India ink, preferably on white 
bristol board plate, 1 ply or 2 ply, which is sufficiently transparent to permit 
tracing if back lighting (e.g., a window pane) is used. Avoid graph paper 
for the reproduction copy, as the printing interferes with proper inking and 
the paper permits no corrections. Sometimes it is desirable to ink in the prin- 
cipal guide lines so that the curves can be more easily read. Good examples 
of graphs can be seen in the Research Quarterly for October 1953, p. 332 
and 366. 

Lettering should be plain and large enough to reproduce well when the 
drawing is reduced to the dimensions of the printed page (4½ x 7 inches). 
Most figures can be advantageously drawn for a linear reduction of one-half 
or one-third. Be sure to draw the lines heavy enough so that they will not be 
overly thin after reduction. Explanatory lettering should be included within 
the chart. Typewritten lettering does not reproduce well; it is much better to 
use a LeRoy or similar lettering device. 

Care should be taken not to waste space, as this means greater reduction 
and a less satisfactory illustration. Often it is possible to combine several 
curves in one figure and enable the reader to make comparisons. 

Halftones are treated as figures and should be so numbered. Frequently, 
several halftones can be grouped to form an attractive full page, in which 
case they should be numbered consecutively, in Roman numerals. Photo- 
graphs should be in the form of clear black-and-white prints on glossy paper. 
Care should be taken to see that they cannot be bent or folded in handling 
and paper clips should not be used. All imperfections are reproduced. 

The legends for the illustrations should be typed upon a separate sheet 
placed at the end of the manuscript. Care should be taken to indicate plainly 
in the text the exact location of all illustrations and tables. 


Special Points of Style 
USE OF NUMBERS 


Use Arabic figures for all definite weights, measurements, percentages, and 
degrees of temperature (for example: 2 kgm., 1 inch, 20.5 cc., 300° C.). 


For numerals used in a general sense, spell out numbers through ten and use 
Arabic figures for 11 and over (seven times, five years old, 11 students). 
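
The general-sense rule above can be sketched as a short routine, offered only as an illustration. The sketch is in Python; the names `SMALL_NUMBERS` and `general_number` are hypothetical, and the handling of zero is an assumption, since the guide mentions only numbers "through ten." Definite weights, measurements, percentages, and temperatures are outside this rule and always take Arabic figures.

```python
# Words for the numbers the guide asks authors to spell out.
# Including "zero" is an assumption; the guide does not mention it.
SMALL_NUMBERS = [
    "zero", "one", "two", "three", "four", "five",
    "six", "seven", "eight", "nine", "ten",
]


def general_number(n):
    """Return a number as it should appear when used in a general sense:
    spelled out through ten, Arabic figures for 11 and over."""
    if 0 <= n <= 10:
        return SMALL_NUMBERS[n]
    return str(n)


print(general_number(7), "times")      # seven times
print(general_number(11), "students")  # 11 students
```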


ABBREVIATIONS AND SYMBOLS 

Standard abbreviations should be used whenever the weights and measure- 
ments are used with figures, e.g., 10 kg., 6.25 cc., etc. The forms to be used 
(for both singular and plural) are: ft., ft.-lb., ft./sec., in., yd., min., hr., sq. 
ft., sq. in. Gram should be spelled out in all cases to avoid possible confusion 
with grain; also spell out mile. All obscure and ambiguous abbreviations 
should be avoided. Symbols used should follow the notation listed in Research 
Methods (AAHPER), p. 518-20, 522-25. Most common are: 


M = mean                           r = Pearson correlation 
Mdn = median                       r_bis = biserial correlation 
N = number of individuals          r_1I = reliability coefficient 
n = number of measurements         χ² = chi square 
σ = standard deviation             F = variance ratio 
σ_M = standard error of mean       t = Student (Fisher) t ratio 


Percent should be one word. Use percent sign (%) in tables or when it 
appears in parentheses in text. 


Proofreading 


The author will receive his original manuscript and any engraver’s proofs 
with the galley proofs of his article for correction. A reprint order blank will 
be enclosed for the author’s convenience; it should be returned whether or 
not reprints are wanted. 

Corrected proofs and original manuscripts are to be returned 
within 48 hours after receipt by first-class mail to the Editor, 
AAHPER, 1201 Sixteenth Street, Northwest, Washington 6, D. C. 





RECENT 
AAHPER 
PUBLICATIONS 


FIT TO TEACH 
First definitive book on the teacher’s health. 


This book explores the major problems and gives resources 
for solving them. It outlines personal, administrative, organ- 
izational, and community responsibilities. 


1957 Cloth. 250 pages. $3.50 


CASTING AND ANGLING 


The most comprehensive single book on the subject. 


This latest book in the Outdoor Education series was pre- 
pared by experts under the chairmanship of Julian W. Smith, 
Director of AAHPER’s Outdoor Education Project. 


1958 Illus. 52 pages. $2.00 


HEALTHFUL SCHOOL LIVING 
A comprehensive guide to a healthful environment. 


This is third in a series of reports of the Joint Committee 
on Health Problems in Education of the National Education 
Association and the American Medical Association. Charles 
C. Wilson, M.D., Yale University, was the editor. 


1957 Illus. 400 pages. $5.00 





Order From 


AAHPER, 1201 - 16th St., N. W., Washington 6, D. C. 





AAHPER Conference Reports 


Recreation for the Mentally Ill 
Nov. 17-20, 1957 
1958 $2.00 

Professional Preparation of Recreation Personnel 
Nov. 14-16, 1956 
1957 $1.00 

Education for Leisure 
May 15-18, 1957 
1958 $1.00 





National Conference for City Directors 
of Health, Physical Education, and Recreation, Dec. 9-13, 1956. 
(Cities with a Population of 50,000 to 100,000) 


1957 $1.00 


National Conference for City Directors 
of Health, Physical Education, and Recreation, Dec. 11-15, 1955. 
(Cities with a Population over 100,000) 


1956 $1.00 


Intramural Sports for College Men and Women 
Oct. 30-Nov. 2, 1955 
1956 $1.00 

Physical Education for College Men and Women 
Oct. 4-6, 1954 
1955 $1.00 


Health Education for Prospective Teachers 
January 8-10, 1956 
1956 $1.00 

A Forward Look in College Health Education 
January 8-13, 1956 
1956 $1.00 





ORDER FROM 
AAHPER, 1201-16th ST., N.W. WASHINGTON 6, D.C. 





NEW 
FITNESS 
SERIES 


FITNESS FOR YOUTH 


Booklets that will keep you up to date on fitness refer- 
ences and official action on fitness in our areas. 


1—Selected Fitness References 
1958 16 pages, 50¢ 


2—References on Facilities and 
Equipment 
1958 20 pages, 75¢ 


3—Exercise and Fitness 
Joint Statement by AMA and AAHPER 


1958 8 pages, 25¢ 


Others To Be Announced 


Discounts on quantity orders: 2-9 copies, 10%; 10 or 
more copies, 20%. 





Order from 


AAHPER, 1201 Sixteenth Street, N.W., 
Washington 6, D. C. 





